INTRODUCTION


I believe that there is a need for a consistency proof for Zermelo-Fraenkel
set theory. What is more, I think that the search for such a proof is a
mathematically feasible project. I am not speaking here of a finitary con- 
sistency proof of the sort which Gentzen gave for number theory— 
although, of course, one cannot rule out the possibility of such a proof. 
No, what I have in mind is a proof based on other, less restrictive principles. 

If one looks at the existing proofs of the consistency of number theory,
one notices that they all take the correctness of the intuitive mathematical 
principles underlying the formalised axioms for granted. And this applies,
in particular, to the induction axiom. What these proofs attempt to justify 
is the embodiment of those intuitively valid principles in the classical (or 
indeed, in the intuitionistic) first order predicate calculus. What I am
proposing is a search for an analogous justification for the embodiment of
the intuitive principles of Zermelo-Fraenkel set theory in the first order
predicate calculus. In particular, I believe that the employment of formulas 
involving unrestricted quantification over all sets in the axiom schemas of 
replacement and comprehension must be analysed, so that mathematical 
reasons can be given for the belief that no contradiction can be derived 
from the formalised versions of these principles.

But what is the point of view with respect to which this justification is 
to take place? What sort of reassurance can we reasonably expect? It is 
obvious that no consistency proof can be absolute. Each such proof must 
rest upon the assumption that the means of argument which it itself 
employs are legitimate. The value of such a proof therefore depends 
directly upon the conceptual gap between the modes of argument and 
definition employed in the proof, and those justified by it. Clearly the 
question which must be asked as regards the consistency problem for
Zermelo-Fraenkel set theory is whether there is a plausible standpoint 
with respect to which the consistency of the theory is genuinely problematic,
but which is not, at the same time, so severely restrictive with regard
to permitted modes of argument as to preclude all reasonable hope of 
actually producing a consistency proof. 

My principal aim is to show that such a standpoint exists, and that it is 
to be reached by taking a unified approach to the foundations of set theory 
and classical predicate logic. Indeed, I am convinced that this unified 
approach is the only coherent approach to those foundations, quite apart 
from any connection it has with the consistency problem. Moreover, it can 
be regarded, as I shall explain, simply as a sharp formulation of the point
of view which the vast majority of present day mathematicians hold, either
consciously or unconsciously, concerning the basic principles of set theory.

I have divided this essay into three sections. The first two are alternative
routes leading to the same goal—a unified system of classical predicate 
logic and general set theory. In section 1, this unified theory appears as a 
solution to practical foundational problems—problems which arise when 
one attempts to formalise set theory in the usual way as a classical first order 
theory having an intended, or standard, interpretation. In section 2, the 
unified theory is derived from Cantor’s doctrine of the absolute infinity of 
the realm of sets. This brings out an important analogy between the 
classical Cantorian point of view in set theory and the traditional finitist
position associated with Hilbert’s programme. And this analogy leads to a 
sharp formulation of the consistency problem for set theory. Section 3 
deals with the natural axioms of strong infinity—the axioms postulating the 
existence of strongly inaccessible numbers, and the Mahlo axioms. I
subject the informal arguments which attempt to establish these axioms
as natural and inevitable extensions of the basic principles of Zermelo-
Fraenkel to a thorough and searching analysis. It is quite necessary for my
overall purpose that I show where those informal arguments break down.
For if those arguments were to prove substantially correct, then the con- 
sistency problem for set theory, as I have envisaged it, would collapse. 

Sections 1 and 2 are both more or less self-contained. In any event, each
of them can be considered apart from its connections with the consistency 
problem. Section 1, in particular, should be read as a sustained argument
for an alternative to the currently accepted approach to the foundations of 
set theory and classical predicate logic. 

Some of the topics I shall take up may at first seem somewhat novel. But 
initial impressions may prove deceptive here. For, in fact, my whole discussion
derives, in one way or another, from Gödel’s two fundamental
articles dealing with the foundations of classical mathematics ([1947] and 
[1944]). Indeed, I take the point of view expounded in his [1947] as my 
starting point. This is not to say that I accept all the arguments that Gödel 
presents there. On the contrary, I take issue with him on two important 
points: the scope of classical logic and the status of the natural axioms of 
strong infinity. But the problems discussed, particularly those in sections 1 
and 3, are problems which were first raised in that article. (Gödel’s views
concerning the scope of classical logic are also discussed in Kreisel [1965] 
and in Wang [1974].) In section 2, I have, in addition, drawn on the 
philosophical writings of Cantor (especially ‘Über die verschiedenen
Standpunkte in bezug auf das aktuelle Unendliche’ reprinted in Cantor
[1966] pp. 370-439) and those of Gentzen (especially the introduction to
‘The Consistency of Elementary Number Theory’ reprinted in Gentzen
[1969] pp. 132-213).

1 SET THEORY AND CLASSICAL LOGIC


1.1 The conventional view of classical logic.

Some logicians of an earlier generation hoped that the theory of sets, and 
thereby all of mathematics, could be incorporated into a single, universal 
system of mathematical logic. Such ideas are no longer in fashion. Logic 
and set theory have long since gone their separate ways. And yet, even the 
most fanatical of formalists must admit that set theory does, after all, have 
a special role to play in the foundations of classical logic. The semantical 
ideas of satisfaction, truth, validity, etc., are, at the very least, of con- 
siderable heuristic value in motivating formal systems of logical proof. 
They are, of course, much more than that. What is more, these semantical
ideas are essentially set-theoretical in character.


4 John Mayberry 


The special role that set theory plays in the foundations of classical logic
confronts us with a serious dilemma. Which of these theories are we to 
regard as the more fundamental? Like any other mathematical theory, set 
theory must be formalised. Its basic assumptions (axioms) must be set out, 
and its underlying logic must be made precise. This calls for classical 
predicate logic. On the other hand, for classical predicate logic the 
semantical notions are absolutely fundamental. They, and they alone, can 
justify the logical axioms and rules employed in formal proofs. Thus 
classical predicate logic presupposes the theory of sets. 

The conventional response to this dilemma is to acknowledge—or, at 
any rate, to employ—two set theories. There is the official theory, a 
formalised, axiomatic theory, whose underlying logic is the classical pre- 
dicate calculus. Then there is the unofficial theory, informal, intuitive— 
naive, if you will. Those who use this unofficial theory are often not even 
conscious of doing so. Nevertheless, it makes its presence felt in various 
ways, some of them harmless, some of them decidedly not. 

But the most serious shortcoming of this conventional attempt to escape 
the dilemma is that it blocks any fundamental effort to understand the real 
connection between set theory and classical logic. And this is not merely a 
parochial matter, of interest only to logicians and set-theorists. On the 
contrary, it raises issues of serious practical concern to all mathematicians, 
issues which go to the heart of the present day practice of mathematics. 

The fundamental cause of the confusion over the role of set theory is the 
currently prevailing view of first order classical logic—I shall refer to it as 
the orthodox or conventional view. Its essential ingredients are two simple 
conventions. One concerns how to formalise classical predicate logic, the 
other, how to apply it. According to these conventions, first order logic 
should be formalised in the usual way as a logic of predicates, propositional 
connectives, and quantifiers. It should be applied to provide the underlying 
logic of any intuitively given mathematical theory whatsoever. 

The conventional view cuts cleanly across party lines. Whoever accepts 
it—and that includes almost everyone—sees the principal function of 
predicate logic as that of providing the formally precise underlying logic 
for intuitively given theories. He may take his mathematical intuitions 
seriously (if he is a platonist) or dismiss them as of psychological or socio- 
logical significance only (if he is a conventionalist or formalist). He will, in 
any case, formalise them in first order languages of the conventional sort. 

None of this is at all surprising, for it is very difficult to see the conventional
view as a point of view at all. It simply seems to be the way one
does things. A method rather than a doctrine. 

And it is here that we confront the most serious obstacle to logical
reform. The conventional view, considered as a method, has an impressive
record of solid technical achievements to its credit—a whole series of 
profound and beautiful theorems, requiring a large body of ingenious, 
mathematically sophisticated methods for their proofs. How could there 
be anything fundamentally wrong here? Surely if something is wrong with 
the conventional view, then a radical reformulation of all of contemporary 
mathematics—say, along the lines advocated by the intuitionists—is called 
for. And this is a prospect which few mathematicians can view with 
equanimity.

I am very much aware of these objections, and I shall answer them all in 
due course. Rather surprisingly, it turns out that everything can be put 
right by a simple, one might even say trivial, observation. But before I 
produce the cure, I must describe the disease. I shall begin by pointing out 
the harmful effects of the conventional view on the foundations of set
theory itself. Then I shall trace the consequences of these effects in the 
foundations of the theory of categories. Finally, I shall briefly indicate the 
way out of these difficulties. This will lead naturally to the discussion, taken 
up in section 2, of the problem of infinity in the foundations of classical 
mathematics.


1.2 The consequences of the conventional view of classical logic for the 
foundations of set theory. 

Let us recall briefly why it is that classical first order logic requires set 
theory for its foundations. It is certainly possible to formulate classical logic 
in a purely combinatorial way, as a system of formal proofs proceeding 
from logical axioms by means of rules of logical inference. In this way, one 
can replace the primitive semantical notion of a formula’s being logically 
valid, i.e. true under all possible realisations, with the purely combinatorial 
one of a formula’s being formally provable. But the theoretical justification 
of this replacement can only be provided by the Completeness Theorem, 
a theorem which requires set-theoretically formulated semantical notions 
both for its statement and its proof. It is the Completeness Theorem alone 
which supplies the entire raison d’être of a system of formal proofs. Without
it, no one can say why anybody should be interested in the theorems of
this particular theory, or why the consequences of precisely these axioms and
rules should be identified with the set of logical truths. 
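
The role assigned to the Completeness Theorem in this paragraph can be displayed schematically. What follows is only a sketch in standard textbook notation, which is an assumption and not the author's own: $\vdash \varphi$ abbreviates 'φ is formally provable' and $\models \varphi$ abbreviates 'φ is true under all possible realisations'.

```latex
% Requires amsmath for \text. Notation is assumed, not taken from the essay.
% Soundness: every formally provable formula is logically valid.
\[
  \vdash \varphi \;\Longrightarrow\; \models \varphi
  \qquad\text{(soundness)}
\]
% Completeness: every logically valid formula is formally provable.
\[
  \models \varphi \;\Longrightarrow\; \vdash \varphi
  \qquad\text{(completeness)}
\]
% Together they identify the combinatorial notion with the semantical one.
\[
  \vdash \varphi \;\Longleftrightarrow\; \models \varphi
\]
```

The right-hand notion quantifies over all realisations, and it is at that point that set-theoretically formulated semantics enters both the statement and the proof.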

Some formalists, like Bourbaki, argue that formalising logical proof 
merely makes precise the informal means of argument that mathematicians, 
as a matter of sociological fact, accept. I do not think such arguments carry 
much conviction. But in any event, arguments of that kind can only justify 
a system locally (i.e. justify each of its axioms and rules individually) and
cannot even begin the much more difficult and important task of the global
justification of the system of axioms and rules taken as a whole. One could 
perhaps argue that the axioms and rules of a system are intuitively correct. 
But could one argue that they are intuitively complete? 

Given this special role of set theory in the foundations of classical predicate
logic, it follows that an axiomatic definition of set makes no sense.
Any attempt to give such a definition would obviously be circular. And 
once we have accepted that a formal axiomatic foundation for set theory 
simply is not a possibility, we can begin to appreciate the really formidable 
difficulties inherent in the conventional view. 

Let us consider the problem of providing a semantics for first order 
classical logic. On the conventional view, first order logic is not a single 
system, but a collection of distinct, if related, systems, each determined by 
a formal first order language of the traditional sort. Now the formal 
expressions of such a system acquire meanings only when the system is 
provided with an interpretation or realisation. Such a realisation is deter- 
mined by specifying a universe of discourse, over which the quantified 
variables of the language range, together with a suitable interpretation, 
relative to that universe, of the individual, functional, and predicate para- 
meters of the language. The key question is thus: what universes of discourse 
are to be acknowledged? Clearly it is the duty of set theory to provide an 
answer. But this is where the conventional view of classical logic begins to 
work serious mischief. Conventionally, set theory is seen as an intuitive 
theory like, say, Euclidean geometry. As such, it must be formalised to 
make it precise; that is, it must be given a formal underlying logic. And 
this formal underlying logic can only be the classical first order predicate 
calculus. After all, is not classical logic universally applicable? Set theory 
thus comes to be regarded as a formal axiomatic theory with an associated 
standard interpretation. But the usual arguments easily demonstrate that 
the universe of discourse of this standard interpretation cannot be a set, in 
the sense of that standard interpretation. And this means that the very 
attempt to formalise naive set theory in the conventional way has caused that 
theory to fall short of complete generality. Furthermore, it is obvious that 
this is going to happen no matter how the notion of set is analysed. There 
is simply no way out of this difficulty—as long, that is, as the conventional 
view of classical logic holds sway. 
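
The text does not spell out 'the usual arguments'. One of them, on the assumption that a Russell-style argument is among those intended, can be sketched as follows. The names $V$ (the universe of discourse of the standard interpretation) and $R$ are introduced here purely for illustration, and the Separation schema is assumed.

```latex
% Requires amsmath for \text. V and R are illustrative names, not the author's.
% Suppose the universe V of the standard interpretation were itself a set.
% Separation applied to V then yields a set R, and every set lies in V:
\[
  R \;=\; \{\, x \in V \mid x \notin x \,\}
  \quad\text{is a set, hence } R \in V .
\]
% Instantiating the defining condition of R at x = R:
\[
  R \in R
  \;\Longleftrightarrow\; (R \in V \,\wedge\, R \notin R)
  \;\Longleftrightarrow\; R \notin R .
\]
% This is contradictory, so V cannot be a set in the sense of the interpretation.
```

The argument uses nothing about how the notion of set has been analysed beyond Separation, which fits the remark that the difficulty arises however that notion is analysed.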

This state of affairs has quite serious consequences. The careful analysis 
of the notion of set, which led to the intuitive, pre-formalised version of 
Zermelo-Fraenkel set theory, has been brought to nothing. The obscure
notion of arbitrary collection must now replace the clear one of well-founded
set. And if we ignore mere conventions of terminology, we can see that we
are back where we started. For we are still confronted with the task of
formulating a general theory of sets adequate to account for the informal 
use of sets in ordinary mathematics. 

The conventional point of view has thus lumbered itself with two set 
theories: a formal theory of sets, ‘properly’ so-called, and an informal 
theory of arbitrary collections, including and extending the formal theory.1
As we shall see, both of these theories are plagued with foundational 
difficulties—difficulties, moreover, which cannot be confined to set theory 
proper, but which spill over into important areas of ordinary mathematics. 

The informal theory is in much the worse condition of the two. Its 
difficulties are well-known, indeed notorious. What can be said about those 
collections which do not already fall within the scope of the formal theory? 
One supposes that these more general collections are still extensional, that
is that no two distinct such collections can have exactly the same members. 
But then one is immediately up against the embarrassing fact that one 
knows nothing further about this more general notion of collection, apart 
from possessing a single example in the universe of formal set theory— 
though, of course, other examples can be manufactured from this one. 
Beyond these rather barren observations, one is simply not able to claim 
anything with much conviction. Indeed, the situation is even more serious 
than this might suggest. The danger of contradiction confronts us wherever 
we turn. Is there a collection of all collections, for example? (If not, why, 
then, is there a collection of all sets?) And what about the collection of all 
those collections which are not members of themselves? Or the collection 
of all well-ordered collections? Or that of all those collections which are 
well-founded with respect to the membership relation? All these ghosts 
from the past—from the prehistory of set theory and mathematical logic— 
are arising to trouble us yet again. These are difficulties which simply 
disappear on the Zermelo-Fraenkel analysis. Unfortunately, however, that
analysis proves to be insufficiently general, at least so long as we stick to
the conventional formulation of Zermelo-Fraenkel as a first order theory
of the traditional sort. 

1 I believe that Gödel was the first to notice this. I base this belief on his remark that

[the set theoretical paradoxes] are a very serious problem, not for mathematics,
however, but for logic and epistemology. ([1947], p. 262)

I take him to mean that what I am calling here the ‘informal’ or ‘unofficial’ set theory
is the theory for which those paradoxes constitute a problem, and that this theory is not
directly relevant to mathematics for the excellent reasons he goes on to give. In a footnote
added for the 1964 reprinting, he acknowledged, in effect, that the theory of categories
seems to be encroaching upon this wider domain. I shall discuss this latter point at
considerable length in section 1.3.

Of course the foundational problems for the formalised set theory (the
‘official’ theory) are much less severe. Still, even here the conventional
view of classical logic produces distortions—distortions which are perhaps 
the more serious for being subtle in their effects. The distortions to which
I am referring here are produced by the vexed and difficult notion of the 
proper class. 

I have already remarked how the illusion of the need for a more general 
notion of collection arises out of the mere act of formalising Zermelo-
Fraenkel set theory in the conventional way. This gives us the proper class
of ‘all’ sets from, as it were, outside the formal theory of sets. But the notion
of proper class, as is well known, plays a much more intimate role in the 
development of set theory than this suggests. Proper classes are not merely 
entities which must be acknowledged only to be ignored. They actually 
play an important role inside set theory. They are, in practice, thought of as 
constituting part of the furniture of the theory, whatever may be their
legalistic status on the official level of axioms and primitive notions. 
Consider, for example, how one actually thinks of the principle of definition 
by transfinite induction in the usual formulation of Zermelo-Fraenkel
which is officially class free. One is invariably led to formulate and to prove 
this result in terms of classes.1 And the formal apparatus of the contextual 
definition of class abstraction is really just so much window dressing. This 
hypostatisation of the proper class stems directly from the intimate in- 
tuitive connection between proper classes and quantification over the 
supposed universe of all sets. 

Let us examine this connection more closely. The extensional picture of 
proper classes as ‘oversized’ sets arises directly out of the application of 
quantifiers obeying classical logic to the ‘universe’ of sets. This happens in 
the following way: by using classical quantifiers, class abstracts (expressions
of the form {x | φ(x)}, ‘the class of all x such that φ(x)’) can be
introduced, together with (and here is the fundamentally important point)
an extensional equality relation by means of which one can
assert the extensional identity of two such class abstracts, or rather, of
their denotations. And this is precisely the point. In the presence of this
relation of extensional equality (i.e. {x | φ(x)} = {x | ψ(x)} ⇔ ∀x[φ(x) ↔
ψ(x)]) class abstracts assume the appearance of proper names whose
denotations can only be the proper classes themselves. Of course it is still 
possible, formally, to introduce class abstracts, even in the absence of
quantifiers over the ‘collection’ of all sets. But no one is going to be
tempted, in those circumstances, to treat them as names for things. For
then there would be no way of asserting identity between class abstracts,
indeed, no criterion of identity and distinctness at all, no principle of
individuation for proper classes. And where there is no individuation there
can be no question of individuals.

1 I do not wish to suggest that this use of the proper class is entirely misguided. Employed
in this way, proper classes serve as surrogates for functionals, and the notion of functional
(i.e. globally defined map) is a perfectly legitimate, indeed a centrally important, primitive
notion of set theory. Functionals, however, unlike proper classes, can on no account be
treated extensionally, e.g. as “large” collections of ordered pairs. I shall discuss this
notion of functional, and its central role in the foundations of set theory, in section 2.2.

Notice that the argument here depends crucially on the assumption that 
the quantifiers in question obey classical logical laws. Even were quantifiers 
over all sets obeying intuitionistic laws to be made available, there would
still be no machinery for introducing proper classes in the extensional sense. 
In those circumstances, the law of ‘extensionality’, by means of which 
equality between class abstracts is defined, takes on an entirely different, 
more intensional, meaning, owing to the difference between the classical 
and intuitionistic meanings of the logical symbols. Formally, the failure of
extensionality would be reflected in the undecidability of the resulting
equality relation. That is to say, the law of the excluded middle in the form
S = T ∨ S ≠ T would not hold, in general, for class abstracts S and T.

Thus we need full-blown classical quantification in order to introduce 
proper classes. On the other hand, if we have proper classes, with class 
abstraction and a decidable equality relation, we can easily give contextual 
definitions of the classical quantifiers. In this strictly formal sense, then, 
the two notions are equivalent. But the connection between them goes 
deeper than this. For when classical quantifiers are present, there is an 
almost irresistible temptation to think in terms of proper classes, especially 
in the presence of axiom schemata, like Replacement, which are naively 
understood as second order axioms about proper classes. Thus when 
classical quantification is available, the whole proper class Weltanschauung 
forces itself upon us. There is strong empirical evidence for this in the 
upward ontological drift exhibited in the passage from Zermelo-Fraenkel 
set theory (in which class abstracts appear) through Bernays-Gödel set
theory (in which the basic objects are now classes, and classical quantifica- 
tion over the ‘totality’ of classes is now allowed) to Morse set theory (in 
which not only is quantification over classes permitted, but also impre- 
dicative definitions of classes by means of conditions in which such class 
quantifiers appear). 
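
The formal equivalence claimed at the start of this paragraph can be made explicit. The display below is a sketch only: the first line restates the definition of equality between class abstracts from classical quantification, and the remaining two give one possible pair of contextual definitions running in the opposite direction. The particular choice of the defining classes $\{x \mid x = x\}$ and $\{x \mid x \neq x\}$ is an assumption made for illustration, not something the text specifies.

```latex
% Equality of class abstracts, defined from classical quantification:
\[
  \{x \mid \varphi(x)\} = \{x \mid \psi(x)\}
  \;\Longleftrightarrow\;
  \forall x\,[\varphi(x) \leftrightarrow \psi(x)]
\]
% Conversely, one way to define the quantifiers contextually, given class
% abstraction and a decidable equality relation (illustrative choice):
\[
  \forall x\,\varphi(x)
  \;:\Longleftrightarrow\;
  \{x \mid \varphi(x)\} = \{x \mid x = x\}
\]
\[
  \exists x\,\varphi(x)
  \;:\Longleftrightarrow\;
  \{x \mid \varphi(x)\} \neq \{x \mid x \neq x\}
\]
```

Read this way, decidability of equality between abstracts delivers excluded middle for the universally quantified statement, which is why the intuitionistic case discussed earlier behaves differently.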

But granted that the notion of proper class has dubious antecedents, 
granted that it is produced as a byproduct of a particular approach to the 
formalisation of first order logic, is it not still possible that this notion has 
merit in its own right? Might we not, indeed, reverse the original process, 
and justify the conventional formulation of set theory using the notion of 
proper class? These are obviously points to be answered if my case against 
the conventional view is going to hold up. I will consider the role of this
notion in set theory itself first, and save for later a discussion of how it
enters into the ordinary practice of mathematics.

Let us look at the set-theoretical role of proper classes, then. The really
central difficulty here is that of explaining the nature of the difference
between proper classes and mere sets. There does not appear to be any
intelligible internal or structural difference between them which would
allow us to say ‘Here the process of set formation must stop, for now we
have reached a proper class.’ Proper classes are collections; so are sets.
Proper classes are well-founded with respect to the membership relation; 
so are sets. Proper classes are extensional: they are completely determined 
by their members so that if two of them are distinct, they must differ over 
the membership of some element. The same, of course, applies to sets. 

Notice that the problem of distinguishing proper classes from sets is
made much more difficult in classical set theory by the presence of trans- 
finite sets. If such sets were outlawed, if all sets were hereditarily finite, 
then it would be possible to take the familiar property, which in classical 
set theory serves to characterise transfinite sets, as the required internal 
property of proper classes which would allow them to be distinguished 
from sets. But, of course, this option is no longer available once transfinite 
sets have been postulated (or even just not ruled out). 
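
The 'familiar property' is not named in the text. On the assumption that Dedekind's characterisation of the infinite is what is meant, it can be written as follows; the symbols $X$ and $f$ are introduced here for illustration only.

```latex
% Requires amsmath (\text) and amssymb (\subsetneq).
% Assumed reading of the 'familiar property': Dedekind-infiniteness.
\[
  X \text{ is Dedekind-infinite}
  \;\Longleftrightarrow\;
  \exists f : X \to X \;
  \bigl(\, f \text{ injective} \;\wedge\; f[X] \subsetneq X \,\bigr)
\]
% Illustration: on the totality of all sets, x |-> {x} is injective
% (by extensionality) and never takes the empty set as a value.
\[
  x \mapsto \{x\}, \qquad \emptyset \notin \{\, \{x\} \mid x = x \,\}
\]
```

If every set were hereditarily finite, no set would have this property while the totality of all sets would, so the property would separate the two. Once transfinite sets are admitted, sets have it as well and the separation is lost.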

What about the traditional explanation that proper classes, unlike sets, 
cannot be members of further collections? Well, in the first place it is 
doubtful that anyone takes this explanation seriously. It certainly does 
serve to distinguish proper classes from sets in the particular contexts of 
certain formal systems of set theory. But most of those who would offer 
such an explanation (in the course of describing the intended interpretation 
of Bernays-Gödel class-set theory, for example) would cheerfully admit
that it would not apply to higher order formalisations of set theory. Set 
theory with classes is, after all, just second order set theory. And there is 
certainly nothing in the conventional view to suggest that set theory could 
not be formulated in classical predicate languages of arbitrarily high, 
indeed even of transfinite, order. One might even say that considerations of 
uniformity in classical logic demand that whatever could serve as the 
universe of discourse of a second order language could serve as the universe 
of discourse for any higher order classical language whatsoever. 

And anyway, quite apart from these logical considerations, once having 
admitted proper classes as objects, clearly individuated by means of un- 
ambiguous identity conditions, what can we put forward as an argument 
against accepting collections of such objects? After all, since we can distinguish
proper classes, we can presumably count them, which strongly
suggests that we should be able to form them into aggregates or collections.
Surely the idea of a definite thing which cannot be put into collections with 
other things is completely unintelligible. But what is now to stop us from 
forming collections of classes, families of collections of classes, aggregates 
of families of collections of classes, and so on up through a new transfinite 
hierarchy, which has the same closure properties for its ‘ordinals’ as were 
possessed by those of the original (actual?) universe. Having gone this far, 
why not introduce ‘proper classes’ over this new universe and begin the 
whole business over again ... and so on ... . There is no obvious stopping
point for such a process. The notion of class can now be seen to be a special 
case of a much more general notion. But what are we now to make of the 
claim that the original universe was the universe of sets? And if we cannot 
make sense of this claim, was not the original universe just a set all along? 
And, if so, what has happened to the notion of proper class? 

The principles of set construction embodied in the Zermelo-Fraenkel
axioms seem natural and inevitable to present day mathematicians. Indeed, 
the conviction that those constructions should be universally applicable is 
absolutely fundamental to the classical outlook in mathematics. This is 
perhaps the single most important principle among those which make up 
that outlook. It follows that the idea of a kind of collection to which it is not 
possible to apply all of these principles of construction is an inherently 
unstable one. It is thus in the very nature of things that the notion of 
proper class should give rise to a more general notion which does allow for 
the universal applicability of these principles, and therefore satisfies the 
Zermelo-Fraenkel closure conditions.1 But then the attempt to treat this 
more general notion formally, in accordance with the conventional approach 
to classical logic, leads back to the notion of proper class, however this time 
at a ‘higher level’. We thus have what appears to be a dialectical process, in 
which larger and larger collections are generated out of the conflict between 
two unavoidable but seemingly incompatible principles: the principle of 
the universal applicability of the basic operations of set theory, and the 
principle of the universal applicability of classical logic. 

Actually, as we shall see, there is no real conflict here. There is only the 
appearance of one, caused by the conventional view of classical logic. The 
set-theorist in the grip of that conventional view is like a man caught in a 
revolving door. By delaying his exit in order to take just that one critical
additional step, he is forced to whirl round on yet another circuit. And when
he finally decides, if he ever does, to omit the extra step, and thereby
successfully to negotiate his exit, he will find himself none the better off
for all of those extra trips round.

1 This demand for uniformity reflects a deep and indeed indispensable principle of
classical mathematics. In particular, it makes it possible for mathematicians to employ
set theory informally, and in a completely unselfconscious way, in definitions and proofs.
I shall elaborate on this point in section 2.3.

The supposed dialectic is just an illusion. But it seems very real for all
that. It is a deep-seated illusion. And the trouble it causes cannot be 
confined to set theory. 


1.3 The consequences of the conventional view of classical logic for 
mathematics in general. 

So far we have considered the harmful effects of the conventional view of 
classical logic only in so far as they are manifested in the foundations of set 
theory. But since set theory plays so central a role in the axiomatic method, 
it is impossible to confine these harmful effects to set theory alone. Indeed, 
wherever the axiomatic method is used, there is always the danger that 
these difficulties will show themselves. There is, however, one theory in 
which these set-theoretical problems are especially conspicuous. Moreover, 
this theory is so widely applicable and so centrally important, that the 
presence in it of such problems is a matter of practical concern to all 
mathematicians. I am referring here to the theory of categories. 

In category theory we have, for the first time, an algebraic theory of 
sufficient generality and power to be capable of organising the whole of 
pure mathematics. The arrow diagrams and the elementary vocabulary of 
the theory are already in standard use, and even more substantial encroach- 
ments into mathematical practice have occurred. Moreover, the funda- 
mental notions of category theory—natural transformation, universal arrow, 
adjoint, etc.—are clearly destined to become the central notions of modern 
algebra, if, indeed, they have not already achieved that status. And all of 
this means that problems in the foundations of category theory will have 
repercussions throughout the whole of mathematics. 

Like other algebraic structures, categories are determined axiomatically, 
so that difficulties in the foundations of set theory are potential sources of 
trouble. But category theory is much more vulnerable to these difficulties 
than other algebraic theories, because it raises, in a quite unprecedented 
way, questions about the global structure of mathematics. 

Such questions of global structure are regarded with considerable unease 
and suspicion by the conservative majority of mathematicians. And this is 
as it should be. For there is an important sense in which such questions 
violate the essential spirit of modern classical mathematics. But it is to the 
considerable credit of the category theorists that they have raised them, 
because in facing these global issues we are forced to look seriously at the 
problems in the foundations of set theory occasioned by the conventional 
view of classical logic. For it is at the global level in set theory that these 
problems arise. 

Let us look at the account of the foundations of category theory given by 
MacLane in his Categories for the Working Mathematician ([1971]). Here we 
find a perfect illustration of the two set theories doctrine which I discussed 
earlier. MacLane finds it necessary to begin his treatment of category theory 
by distinguishing categories proper, the collection of whose arrows must 
form a set in the sense of Zermelo-Fraenkel, from meta-categories, which 
satisfy exactly the same axioms, but whose arrows need not form a collection 
in this ‘restricted’ sense, and may constitute any collection whatsoever. 

Obviously this distinction between categories and meta-categories 
presupposes a prior set-theoretical distinction between sets proper and 
collections in general, a distinction, it will be recalled, which the conventional 
view of logic alone forces upon us. Notice, however, that a similar distinc- 
tion could equally well be made between groups proper and meta-groups, 
rings proper and meta-rings, etc. Why is it, then, that no one ever feels 
called upon to make these latter distinctions? Why is it only category 
theory which requires that such a distinction be made? The answer lies in 
the generality of its application. Category theory is unique in attempting 
an algebraic treatment of problems of global structure. In particular, there 
is nothing in group theory or ring theory analogous to the concrete meta- 
categories of category theory. It is via these structures that global issues are 
introduced into the theory. And it is here that the shortcomings of the 
conventional treatment of set theory become visible for all to see. In 
Categorical Algebra and Set Theoretical Foundations ([1971]) MacLane gives 
an admirably succinct account of the matter: 

Categorical algebra has developed in recent years as an effective method of 
organising parts of mathematics. Typically, this sort of organisation uses notions 
such as that of the category G of all groups. This category consists of two 
collections: The collection of all groups G and the collection of all 
homomorphisms φ: G → H from one group G into another one; the basic operation 
in this category is the composition of two such homomorphisms. To realise the 
intent of this construction it is vital that this collection G contain all groups. 
However, if collection is to mean ‘set’ in any one of the usual axiomatisations, 
this intent cannot be realised. 

Of course, on the conventional view of set theory, meta-categories are 
‘there’ in any event. We are stuck with them—and with meta-groups, meta- 
rings, etc. But only in category theory do such structures become visible. 
And the passage just quoted suggests why. It is by trying to treat whole 
algebraic theories as structures, that concrete meta-categories (like the 
meta-category of all rings, etc.), and hence meta-categories in general, come under 
consideration. 

Notice that there are really two complaints to be levelled against the 
conventional formulation of set theory arising out of these considerations. 
The first, and most basic, is that we are saddled with these meta-structures 
(both concrete and otherwise) which fall outside the compass of our formal 
set theory. What is the use of a ‘general’ set theory which appears to fall 
so grievously short of full generality? This complaint has really nothing 
directly to do with category theory. The latter only serves to make us more 
vividly aware that something is amiss. The second complaint here is that 
the conventional formulation of set theory does not allow us to form the 
all inclusive collections (e.g. the collection of all groups or that of all rings) 
which go to make up the all inclusive concrete meta-categories under dis- 
cussion. It is especially important that we keep these two complaints 
separate. For, as we shall soon see, the first is a legitimate indictment of a 
mistaken doctrine, whereas the second actually arises out of that mistaken 
doctrine itself. 

But let us continue, now, with our investigation of MacLane’s foundational 
arrangements for category theory. For we have not yet got the full story. 
The conventional formulation of set theory has not yet exhausted its 
capacity to make mischief. It has still further distinctions to impose upon 
the theory—distinctions to be drawn, this time, among the categories 
proper themselves. Inside his official set theory—which satisfies the 
Zermelo-Fraenkel axioms—MacLane postulates the existence of a set U 
(for universe) which itself satisfies the second order versions of the Zermelo- 
Fraenkel axioms. He then defines a small category to be a category, the set 
of whose arrows is a member of U, and a large category to be a category, the 
set of whose arrows is a subset of U. 
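
The two definitions just given can be set side by side in symbols. This is only a sketch: the notation Arr(C) for the set of arrows of a category C is introduced here for illustration and is not MacLane's, and the final implication holds only on the added assumption that U is transitive, which the text above does not state explicitly.

```latex
% Arr(C): the set of arrows of the category C (illustrative notation, not from the source).
\[
  C \text{ is small} \iff \mathrm{Arr}(C) \in U,
  \qquad
  C \text{ is large} \iff \mathrm{Arr}(C) \subseteq U .
\]
% Assumption: U is transitive, i.e. x \in U implies x \subseteq U.
% Under that assumption every small category is also large:
\[
  \mathrm{Arr}(C) \in U \;\Longrightarrow\; \mathrm{Arr}(C) \subseteq U .
\]
```

On this reading the four grades of category distinguished in the next paragraph differ only in where the collection of arrows sits: in U, included in U, in the official universe of sets, or among collections in general.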

Notice the complications that the conventional view of set theory and 
classical logic has imposed. We have a surrogate universe of sets U, with 
its own surrogate proper classes, inside the actual, or official, universe of 
sets, which, in its turn, is only a single collection among the totality of all 
collections, taken in the most general or arbitrary sense. And, correspond- 
ing to these gradations in the ontological hierarchy, we have small 
categories, large categories, categories proper and meta-categories, 
respectively. All of them satisfy exactly the same axioms. They differ only 
in the kinds of collections formed by their arrows. It is obvious that something 
must be wrong here. Surely things cannot be that complicated. And 
yet, the foundational arrangements that Maclane has made here are 
probably the best that anyone can come up with and still conform to the 
conventional view of set theory and classical logic. In his earlier treatment 
(see MacLane and Birkhoff [1967], ch. XV) he had used the Bernays-Gödel 
system for his official set theory. In that approach the large-small 
distinction for categories corresponds to the class-set distinction in the 
underlying official set theory. However this approach via Bernays-Gödel 
is cumbersome, and compares very unfavourably with the later treatment. 
The effect of his later approach is to close out the domain of large categories 
under the operations embodied in the Zermelo-Fraenkel axioms. In this 
way he allows himself sufficient elbow room to carry out arguments and 
constructions involving large categories in a straight-forward and unself- 
conscious way, a way which avoids constant reference to the formal details 
of the underlying formal set theory. There can be no doubt that this shows 
sound mathematical judgment. But it does rather appear to have destroyed 
the whole point of the large-small distinction, which did at least make a 
kind of sense in the earlier formulation. Surely in a Zermelo-Fraenkel 
universe, every set is small. However, we cannot, in all conscience, saddle 
MacLane with the responsibility for a faulty formulation of set theory, 
one, moreover, which seems inevitable given the universally accepted way 
of looking at classical predicate logic. 

MacLane himself makes it clear that he is not happy with the founda- 
tional arrangements he has made: 


Our foundation by means of one universe does provide, within set theory, an 
accurate way of discussing the category of all small sets and all small groups, but 
it does not provide sets to represent certain meta-categories such as the meta- 
category of all sets or that of all groups. ([1971], p. 24) 


Here is that point about those all inclusive concrete meta-categories again. 
This complaint has become the focus of all of MacLane’s grievances 
against conventional set theory. But what is this meta-category of all 
groups that he seems so concerned about? And what is the significance of 
that italicised all? No doubt it is to indicate that really or absolutely all 
groups are intended. But has he got anything mathematically significant in 
mind, which would distinguish this category of all groups from any 
category of groups satisfying appropriate closure conditions? Does this 
huge category really serve any purpose other than that of relieving us of the 
duty of including such closure conditions in the statements of our theorems? 
Is the category of all groups closed under arbitrary direct products, for 
example? If so, what about the product 


    $\prod \{\, G \mid G \in \Gamma \,\}$

over the collection, $\Gamma$, of all groups, then? Are we really expected to take 
such a monstrosity seriously, as a group? Could it be used to provide a 
counter-example to any interesting conjecture, for example? And if it 
could be, would any group theorist take notice? 

Observe that on the Zermelo-Fraenkel analysis of set, such all inclusive 
categories cannot exist. It is only because Zermelo-Fraenkel set theory 
appears to fall short of complete generality, that such categories seem to be 
genuine possibilities. And it only appears to fall short of full generality by 
virtue of its formalisation in conformity with the conventional view of 
classical logic. Or, in any event, its formalisation in that way is sufficient 
to make it appear to fall short of complete generality. That this defect can 
be remedied still remains to be shown. 

But even if we discount, as I am convinced we must, this complaint that 
no adequate treatment of these supposed all-inclusive concrete meta- 
categories is to be had, the case against conventionally formulated set 
theory remains a formidable one. The question is: what is to be done 
about it? Can we simply get rid of set theory? Well, that would be much 
more difficult than the category theorists seem to realise. For example, in 
the sequel to the passage quoted above MacLane writes ([1971], p. 24): 

... there has been considerable discussion of a foundation for category theory 
(and for all mathematics) not based on set theory. 


But he means only formalised set theory, the official theory, here, not the 
intuitive set theory that provides the semantics for axiomatic theories. This 
is clear from the next sentence: 

This is why we initially gave the definition in a set-free form, simply regarding 
the axioms as first order axioms on undefined terms ‘object of C’, ‘arrow of C’, 
‘composite’, ‘identity’, ‘domain’, and ‘codomain’. 

Of course he intends the meta-categories to be models of these axioms, and 
these models are structures of the unofficial, intuitive set theory. (Notice 
how this theory is simply taken for granted.) 

In this style, Lawvere ([1964]) has given axioms for the elementary (i.e. first 
order) theory of the category of all sets, as an alternative to the usual axioms for 
membership. 


But, one does not have to look very deeply into Lawvere’s [1964] treatment 
of the category of sets, or of his [1966] treatment of the category of 
categories, to see that the idea of denying intuitive set theory its function 
in the semantics of the axiomatic method has simply never entered his 
head.1 And, of course, one would have to deny it that function in order to 
mount a serious attempt to displace set theory from its role in the 
foundations of modern mathematics. 

1 In his [1966], after giving first order axioms for categories he writes: 

By a category we of course understand (intuitively) any structure which is an 
interpretation of the elementary theory of abstract categories, and by a functor we 
understand (intuitively) any triple consisting of two categories and a rule T which assigns 
to any morphism x of the first category, a unique morphism xT of the second 
category in such a way that. 

We are obviously dealing with the unofficial set theory here. It is not clear, though, 
whether the insertions of the word “intuitively” enclosed in parentheses are intended to 
be explanatory or dismissive—probably a little bit of both. Perhaps he hopes, by this 
simple rhetorical device, to disguise the incursion of set theory into his account at this 
crucial point. But the incursion does occur, and with fatal consequences for the point of 
view which he is trying to defend. 

Here we have come to a vitally important subject, a subject of central 
importance for the foundations of mathematics. We may all agree that the 
theory of categories has a fundamentally important role to play in 
mathematics. But what is the nature of that role? Is it a foundational 
theory, a rival to the theory of sets? Or, alternatively, is it a 
non-foundational theory which has introduced considerations into mathematics 
which transcend the scope of ordinary set theoretical foundations, and thus 
demand some entirely new, as yet undiscovered foundational theory? Given 
the part category theory already plays in mathematics, these are questions 
that require answering. But even from within category theory itself, the 
direction in which the theory develops may depend on the answers to 
these questions. For these answers may suggest which problems are 
interesting and important, and which new developments are (or are not) 
worth pursuing. 

For set theory too, these are vital questions. What we have here is the 
first serious challenge to set-theoretical foundations since the ‘foundations 
crisis’ occasioned by the discovery of the paradoxes over three quarters of 
a century ago. It would obviously be foolish to embark upon a programme 
of reform for the theory of sets, if that theory has already become obsolete. 
Clearly it will be well worth our while to consider these issues at some 
length. 


1.4 The category theorists’ case against set theory. 


We have just seen how the set-theoretical difficulties engendered by the 
conventional view of classical logic give rise to foundational problems in 
the theory of categories. One simply cannot deny that the category theorists 
have a legitimate complaint against set theory in its conventional 
formulation. But many category theorists have complaints that go far beyond these 
obviously valid ones, complaints which if justified would require a complete 
revaluation of the foundational role of set theory. And these complaints 
deserve, at least, the courtesy of a careful consideration of their merits. 

One of the standard charges levelled against set theory is that its basic 
concepts and assumptions are remote from everyday mathematical practice. 
Lawvere, for instance, in his [1966] envisages a new foundational theory 
which 

would seemingly [sic] be much more natural and readily-usable than the classical 
one [i.e. set theory] when developing such subjects as algebraic topology, 
functional analysis, model theory of general algebraic systems, etc. 


And MacLane remarks that category theory 

matches very effectively the sort of structure actually investigated in Mathematics. 

and then goes on to observe that 

This match is probably much better than that given in set theory, since elaborate 
sets, with elements of elements of elements, ... are remote from the actual 
central concerns of Mathematics. (MacLane [1968]) 

These observations are basically correct, but the argument is misdirected, 
because it ignores the actual use to which set theory is put as a foundational 
theory. The function of set theory in the foundations of mathematics is a 
logical one. It is essentially a theory of definitions and arguments. It 
provides the raw materials (i.e. sets, functions, ordered pairs, etc.) and 
formal techniques for the definition of mathematical structures and, 
through the unpacking of these definitions, the ultimate principles upon 
which mathematical argument rests. But because it is concerned with the 
logical foundations, rather than with the organisation, of mathematics, it 
does not have anything at all to say concerning what definitions ought to be 
made, or which structures, among the a priori possible ones, might prove 
to be of mathematical interest. 

Given that set theory has this logical role to play in the foundations of 
mathematics, that its function is to show us how to construct precise 
definitions and rigorous proofs, it follows that the virtues of the theory are 
to be sought in the simplicity, naturalness, and parsimony of its primitive 
notions, and in the intuitive plausibility of its basic assumptions. It is 
certainly no reproach to such a theory to say that its primitive notions are 
remote from the practical ends which they must ultimately be called upon 
to serve. On the contrary, the gap between the means and the ends of such 
a theory is a good measure of its utility. For deep, and often surprising, 
results must rest ultimately upon a solid basis of intuitively plausible, 
simple assumptions. 

Notice that even MacLane’s specific criticism of the irrelevance of the 
membership structure of sets of high rank misfires. Of course it is true that 
such sets (whose elements have elements which, in turn, have elements, 
etc.) are not of much mathematical interest, when considered as com- 
plicated orderings under the membership relation, for example. But this 
is certainly not the point behind the construction of the cumulative 
hierarchy. The point is that these sets are obtained simply by the iteration 
of universally accepted principles of set construction; and that in producing 
sets of high rank we also produce sets of high cardinality. And this is of 
interest. Some of the deepest results in general set theory show that 
questions of the existence of certain types of mathematical structure are 
closely related to questions concerning the existence of sets of high 
cardinality (the Measure Problem provides a good example here).1 

An analogous instance comes to mind here. The actual practice of 
mathematical analysis is not directly concerned with the details of the 
Dedekind—Russell construction of a model of the axioms for a complete 
ordered field out of sets of rational numbers. Since all models of these 
axioms are isomorphic, one is as good as another for the development of 
analysis. But the fact that a model for these axioms can be constructed from 
the set of natural numbers using elementary set-theoretical operations is of 
considerable theoretical interest. It may very well be that anyone (of the 
appropriate age and sex) can play the part of Fortinbras (since it is not very 
demanding). It would then follow that nobody in particular need be chosen 
for the part. But it certainly does not follow that nobody need be chosen 
at all. In the same way, the particular nature of the elements of a 
mathematical structure may not matter, but the existence of such a structure 
may be crucial. And, of course, any structure is going to be composed of 
elements having their own individuating peculiarities. 

This brings me naturally to another objection commonly raised against 
set theoretical foundations. This is the objection that set theory obscures 
the central importance of abstract form by introducing distracting and 
irrelevant considerations of the individual natures of the elements that go 
to make up mathematical structures. One often hears it said that the old 
fashioned ‘set theoretical’ approach to structures encouraged mathema- 
ticians to view them in terms of their internal morphology, whereas the 
new, category-theoretical way of looking at them in terms of their relations 
with other structures of the same sort (i.e. in terms of the morphisms of the 
appropriate category) is much more fruitful. In the introduction to his 
[1966] paper on the category of categories Lawvere makes this point: 

In the mathematical development of recent decades one sees clearly the rise of 
the conviction that the relevant properties of mathematical objects are those 
which can be stated in terms of their abstract structure, rather than in terms of 
the elements which the objects were thought to be made of. The question thus 
naturally arises whether one can give a foundation for mathematics which 
expresses wholeheartedly this conviction concerning what mathematics is about, 
and in particular in which classes and membership in classes do not play any role. 

Here we have, I believe, a statement with which many mathematicians will 
instinctively sympathise. It will appeal especially to those who have 
accustomed themselves to thinking about mathematical structures in 
category-theoretic terms. Moreover, the general criticism of set theory 
mounted here will gain considerably in plausibility by seeming to rest on 
the obvious fact that category theory provides a fundamentally new way of 
looking at mathematical structures. Nevertheless, and despite its apparent 
plausibility, this criticism of set theoretical foundations is totally 
misconceived. It is based on a superficial view of the notions of abstraction and 
abstract form, of the roles that these notions play in modern mathematics, 
and of the function of set theory and the axiomatic method in determining 
those roles. 

1 And, as Gödel pointed out more than thirty years ago ([1947], p. 264), the existence of 
sets of large cardinality has important consequences even in the theory of Diophantine 
equations. 

I intend to pursue this matter at some length. The confusion that 
surrounds these notions of abstraction and abstract form is seriously 
detrimental to any attempt to assess the role of set theory in the foundations 
of mathematics. But we must not underestimate the difficulties to be en- 
countered here. What we are up against here involves, after all, the old 
problem of universals, reformulated in modern terms. 


1.5 The role of abstraction in mathematics. 


That mathematics is highly abstract, and that it is concerned with form 
rather than with content, are two propositions to which nearly everyone 
would give immediate assent. They have achieved the status of cliches. Yet 
both are, at best, highly misleading. And without considerable qualification 
they are quite simply false. What is more, they lead to a false view of the 
role of set theory and the axiomatic method in modern mathematics. 

Modern mathematics is not abstract in any interesting sense of that 
word. The mathematics that it replaced—the traditional mathematics— 
that mathematics was abstract. It dealt with idealised shapes and motions 
obtained by abstraction from experience. But this point of view has long 
been abandoned. The genius of modern mathematics consists in the 
avoidance of abstraction, even, indeed, especially in those circumstances in 
which it seems unavoidable. And the means employed in avoiding it are 
set theory and the axiomatic method. 

Now by avoiding abstraction I mean, in the first instance, avoiding the 
postulation of abstract particulars. And by an abstract particular I mean an 
(ostensible) individual object (putatively) obtained from another object, or 
set of objects, by abstraction. Of course this is not adequate as a rigorous 
definition, but it will serve to indicate the sort of thing I have in mind: 
points, lines, planes, geometrical figures of various sorts (these are obtained 
by abstraction from static spatial intuition), geometrical curves and surfaces 
(generated by ideal motions—these are abstracted from kinematic spatial 
intuition), and finally, numbers (both ordinal and cardinal), and order types 
(these are abstracted from sets and structures). Although the exact nature 
of the operation of abstraction is perhaps not very clear, the abstract 
particulars which result from it are quite familiar. Notice that these things, 
even though they are highly idealised, are not universals or Platonic Forms. 
They correspond to Plato’s mathematicals. In Book A of the Metaphysics 
(987b) Aristotle observed that Plato 

distinguished, besides objects perceived and forms, a third kind of entity, the 
mathematical, which are intermediate, differing from things perceived in being 
eternal and unchanging, and differing from forms in that there are many alike, 
whereas each form is unique. 

These mathematicals play an important role in Plato’s epistemology. In 
Book VI of the Republic (509D-511E) where he develops his famous 
divided line metaphor for the stages of cognition, knowledge of the 
mathematicals represents the final stage before the highest form of know- 
ledge which is the contemplation of the forms themselves. (See Copleston 
[1960] ch. 19 for a thorough discussion of this point.) 

These abstract particulars, or mathematicals, are also of central 
importance in the Kantian philosophy. They make it possible for him to draw 
the vitally important distinction between mathematical and philosophical 
reasoning, in chapter I, section I of the Transcendental Doctrine of Method: 

Philosophical knowledge is the knowledge gained by reason from concepts; 
mathematical knowledge is the knowledge gained by reason from the construction 
of concepts. To construct a concept means to exhibit a priori the intuition which 
corresponds to the concept. For the construction of a concept, we therefore need 
a non-empirical intuition. The latter must, as intuition, be a single object, and yet 
none the less, as the construction of a concept (a universal representation), it 
must in its representation express universal validity for all possible intuitions 
which fall under the same concept. (Quoted from the translation [1965] by 
N. K. Smith.) 

Notice the conditions he lays down for the intuition, whose a priori 
exhibition constitutes the construction of a given concept: it must be a 
singular thing which must nevertheless somehow express or incorporate 
the universality of the concept which it constructs. It is thus precisely the 
kind of abstract particular—the mathematical, intermediate between objects 
of sense and universals—which we have been discussing. 

I have mentioned these examples of the role played by abstract particulars 
in the thought of Plato and Kant in order to call attention to the philo- 
sophical significance of their disappearance from modern mathematics. It 
seems to me odd that with all the talk about ‘Platonism’ and ‘Kantian 
intuitionism’ in the foundations of mathematics, this rather obvious point 
is never made. 

This notion of abstract particular is hard to get hold of. Aristotle devotes 
a large portion of Book M of the Metaphysics to a discussion of the difficulties 
to which it gives rise. These objects are, of course, intelligibilia—things to 
be seen only in the mind’s eye. This feature they share with sets. But, by 
virtue of their origin in abstraction, they lack a crucially important property 
which sets possess: these abstract particulars do not possess clear identity 
conditions. Because of their essential nature as abstract particulars, because 
they attempt to combine both particularity and universality, they are not 
adequately individuated.1 

By way of contrast, sets, although they are intelligibles, are not abstrac- 
tions—what could they be abstracted from? As a consequence they do not 
suffer from the lack of a clear-cut criterion of identity and distinctness. 
Indeed even objects of sense compare unfavourably with sets in this 
respect, insofar as it is difficult actually to articulate their identity 
conditions, even though we are fully confident that those conditions exist. 
Because of these clear-cut identity conditions, sets are reidentifiable 
whereas abstract particulars are not. Thus sets have a much stronger claim 
than abstract particulars to recognition as objectively existing entities. 
How do we know, for example, whether the abstract objects perceived by 
two different people (or by the same person on different occasions) are the 
same? Is my intuitive perception of a particular geometrical figure (a set of 
cartesian axes taken in a particular orientation, for example)* the same as 
yours? In what sense are these abstract particulars public objects? And 


1 This observation, however, does not apply to numbers. Those abstract particulars, 
however impenetrable may be their “inner natures”, are quite adequately individuated 
at any rate. No doubt this helpa to explain why they are the only sort of abstract particular 
to have survived the transition from the traditional to the modern outlook in mathematics. 
But the notion of natural numbers as abstract particulars has many other drawbacks, and 
should, in my opinion, be abandoned in favour of an arithmetic based on the notion of 
finite set. Such an approach to the foundations of arithmetic would emphasise, as the 
traditional approach does not, that the central problem for those foundations is the 
analysis of the notion of finiteness. (In this connection, see Zermelo [1909].) 

? Notice that the orientation (the right or left handedness) of a system of Cartesian axes 
provides an excellent example of a definite intuitive property, applying to geometrical 
figures presented in pure intuition, which cannot be defined analytically. Of course the 
relation of right handedness to left handedness can be so defined; but not right handed- 
ness on its own, nor left handedness on its own. In this respect, these properties are in 
strong contrast with other geometrical properties. For example the concept of sphere 
can be analytically defined from the previously constructed concepts of straight line and 
right angle. But the concept of a right (or left) handed Cartesian system of axes must be 
constructed independently of those other constructions. And here I am using the 
expression “to construct” in the Kantian sense meaning “to exhibit a priori in pure 
intuition”. In section 13 of the Prolegomena, Kant argued that a property like right 
(or left) handedness could not be objective (i.e. it could not reside in the Ding-an-sich), 
and that the existence of such properties establishes his claim that space is a mere form 
of sensible intuition. I think that the argument is decisive in favour of his view, but for 
my present purposes it suffices to notice that it shows that these geometrical abstract 
particulars are, in some sense, not public objects. 
what kind of a thing is it which cannot be singled out, or reidentified, or 
even perceived by more than one person? None of these difficulties arise 
with sets, and that is why banishing abstract particulars in favour of sets 
represents a real increase in clarity and rigour in mathematics. 

It still remains for me to explain how set theory allows us to avoid 
postulating these peculiar entities. Actually, the method is well-known. 
Instead of extracting these abstract entities by, as it were, dissolving away 
the irrelevant features of the elements of a structure, we nowadays pass to 
a consideration of the isomorphism type of the given structure. And instead 
of investigating the ‘abstract entities’ obtained in such a way, we ask what 
are the relevant properties (i.e. the properties expressible in the formal 
languages naturally associated with the isomorphism type of the structure 
in question) which hold of any element of any structure isomorphic to the 
given one. Take, for instance, the abstract Euclidean notion of straight line. 
Euclid defines a straight line to be ‘a line which lies evenly with the points 
on itself’, having already defined a line as ‘a breadthless length’ and a point 
as ‘that which has no part’. (See Book I of the Elements.) In modern 
mathematics, this notion is replaced by the notion of straight line in a 
Euclidean structure, where a Euclidean structure is defined to be any 
structure satisfying Hilbert’s axioms for Euclidean geometry. And since 
we can prove that any two such Euclidean structures are isomorphic, and 
since Hilbert’s axioms are all true when interpreted in the traditional 
theory, it follows that we can prove, in the modern approach, exactly the 
same theorems about ‘straight lines’ as before (or rather, more precise 
versions of the same theorems) but without having to postulate abstract 
particulars as forming their subject matter. 

But this elimination of abstract particulars—the Platonic mathematicals 
—from mathematics is not the whole story. Set theory and the axiomatic 
method have yet another valuable service which they perform. They free 
us completely of the need to consider abstract forms or universals. They do 
this by substituting the study of arbitrary structures ‘possessing’ those 
forms or ‘partaking of’ those universals. Indeed, all trace of reference to 
such forms or universals disappears, owing to the fact that the structures 
in question are specified axiomatically in the manner I have just described. 
Whatever can be said in the old-fashioned way in terms of ‘abstract forms’ 
and ‘universals’ can be reformulated much more precisely and simply in 
terms of sets, structures and formal languages. In this way we are spared 
the difficult task of saying just what sort of things those abstract forms and 
universals are. There is no need to conjure up some hypothetical ‘entity’, 
some je ne sais quoi which is the abstract form of a given structure, the 
thing which it and all its isomorphs somehow share. Set theory simply 
banishes the problem of universals from the foundations of mathematics 
as irrelevant. 

If we look at modern ‘abstract’ algebra, we find clear confirmation of the 
validity of these observations. Take category theory, for example. Here we 
have a theory which is frequently described as highly abstract. And, to be 
sure, the formation of concrete categories is, in some ways, analogous to 
the formation of abstract particulars in traditional mathematics. Thus the 
objects of a concrete category are themselves structures, although their 
internal morphology is somehow ‘ignored’ or ‘abstracted from’ when we 
consider the concrete category itself. Since only the isomorphism type of 
the concrete category is mathematically significant, the internal morphology 
of its objects is, indeed, irrelevant. But all of this talk about ‘ignoring 
internal morphology’ occurs entirely on the informal level: it never occurs 
in the statement of a theorem, for example. Formally, categories, even 
concrete ones, are just mathematical structures like any other mathematical 
structures. Indeed, this is the whole secret of why algebra in general, and 
category theory in particular, actually work. All algebraic theories occur on 
the same level formally, no matter what their ‘level of abstraction’ may be 
in the informal or intuitive sense. Thus when group theory is looked at 
from the point of view of category theory, the groups themselves are looked 
at more ‘abstractly’, if you will. Instead of studying particular groups we 
study categories of groups.1 In this way what previously appeared at the 
level of structure (namely particular groups) now appears at the level of 
point-in-a-structure. But strictly speaking—and that is the way we must 
speak here—strictly speaking, categories of groups are just mathematical 
structures in precisely the same sense that groups are themselves. From a 
strictly formal point of view, they are no more ‘abstract’ than groups are. 
And this, of course, is precisely why category theory works. 

Thus even if abstraction does have a role to play in modern mathematics 
—and one must have serious doubts about this—it does not occur in either 
definitions or proofs. Its function can only be an informal, heuristic one. 
Incidentally, notice the nice irony here. Set theory—that supposedly 
Platonistic theory—provides the very means whereby both the Platonic 
mathematicals and the Platonic forms are eliminated from mathematics. 
Those Platonic notions which played so central a part in the traditional 
abstractionist mathematics have now simply disappeared. This destroys 
the whole point of Plato’s famous divided line metaphor, and thereby calls 
into question the position granted to mathematics in the Platonic scheme 


1 Some category theorists would, no doubt, prefer to say “the category of all groups” here. 
But I am anticipating the conclusion, which I shall shortly put forward, that such 
all-inclusive categories do not exist. I have already hinted as much in the earlier discussion. 
of things. It really makes very little sense to refer to modern classical 
mathematics as ‘Platonism’. Nor, come to think of it, does it make any 
better sense to regard set theory as a revival of scholastic realism. On any 
reasonable analysis, it is essentially nominalistic in spirit. 


1.6 The category theorists’ case against set theory refuted. 
Let us go back now and look at Lawvere’s [1966] claim that: 


the relevant properties of mathematical objects are those which can be stated in 
terms of their abstract structure, rather than in terms of the elements which the 
objects were thought to be made of. 


If we attend to the literal meanings of the words, rather than merely 
allowing them to roll sonorously off the tongue, we find that the statement 
as it stands is quite simply false. In mathematics one simply does not state 
properties of mathematical objects ‘in terms of their abstract structure’. 
No mathematical theorem ever has been, or ever will be of the form: 


φ(the abstract structure of the natural numbers) 


for example. Of course we all know what Lawvere had in mind. The 
relevant properties of mathematical objects are those they share with 
isomorphic objects. Now the difference between this formulation and 
Lawvere’s is not simply one of style. It is a matter of rigour. Nor is it 
pedantic to insist, in this instance, on a rigorously correct formulation. For 
by replacing vague talk about the ‘abstract structure’ of objects with talk 
about isomorphism, the essential role of set theory in formulating the idea 
becomes apparent. We do not define isomorphism in terms of identity of 
‘abstract structure’. We rather abandon the vague notion of ‘abstract 
structure’ in favour of the mathematically precise, set-theoretical notion of 
isomorphism. Lawvere’s observation, as I have rephrased it, would never 
tempt anyone to speculate on the superfluity of set-theoretical methods, 
since the set-theoretical content of the observation itself has been made 
explicit. 

It is obvious from everything I have just said that the proposal made by 
certain category theorists to replace set theory by a foundational theory 
which more adequately reflects the ‘abstract’ nature of modern mathematics 
is completely misguided. There is no opposition between a ‘set-theoretical’ 
way of looking at mathematical structures, and a category-theoretical one. 
Set theory is a logical theory. It provides us with the very notion of a 
mathematical structure. Category theory deals with a particular kind of 
mathematical structure—to be sure, a particularly fundamental kind. There 
can be no question of rivalry here. 

Of course there is a loose sense in which it is true to say that category 
theory deals with ‘form’ and set theory with ‘content’. But much more 
significant is the fact that set theory provides the means whereby category 
theorists, and other algebraists, can get on with their consideration of 
‘abstractions’ without really, literally having to employ the opaque and 
difficult notion of abstract form. The algebraist who disdains consideration 
of the actual content of his structures, is like the architect who refuses to 
think about bricks and mortar. For most ordinary purposes, this will not 
impair his effectiveness. But occasionally his failure to consider carefully 
the nature of his materials may lead him to attempt the design of structures 
which cannot actually be built (e.g. the meta-category of all groups). 

It seems to me that much of the general dissatisfaction that category 
theorists feel for set-theoretical foundations stems from a deep-seated 
temperamental antipathy to the sort of considerations which any foundational 
theory must take up. Mathematical logic and set theory, insofar as 
they are employed in the foundations of mathematics, are, after all, applied 
disciplines. Their development is controlled by external considerations— 
by hard facts, if you will. They cannot be allowed to develop their own 
internal impetus, but must, like all applied branches of mathematics, keep 
their connections with the ‘phenomena’ they are called upon to explain. In 
this respect they stand in sharp contrast to algebra, whose essence is the 
avoidance of external obstacles. Algebra is the most free and untrammelled 
branch of mathematics, and consequently the branch furthest in spirit from 
the kind of mathematics in which applications are the central concern. It 
is therefore only to be expected that a first rate algebraist would feel 
instinctively that something was ‘wrong’ with the kind of mathematics 
done in foundations studies. 

But beyond these psychological speculations concerning the origins of 
the category theorists’ general dissatisfaction with set theoretical founda- 
tions there is the patent and undeniable fact that they have quite legitimate 
specific complaints. These complaints are not against set theory itself, however, 
but against the standard formulation of set theory, i.e. its formulation 
in conformity with the conventional view of classical logic. And here it 
must be said that the all-inclusive concrete meta-categories which figure 
so prominently in these complaints are simply illusions, generated by this 
standard formulation of set theory. There is no meta-category of all groups. 
There is, indeed, no collection of all groups. Given any collection of 
groups, one can easily construct groups which are not in it using simple 
cardinality considerations. There is no need for the distinction between 
meta-categories and categories. Nor is there any need for the distinction 
between large and small categories, either: the arguments which purport 
to establish the need for the latter distinction are simply circular.1 The 
foundations of category theory are much simpler and more straightforward 
than the existence of all of these confusing distinctions would 
suggest. Basically what we have here is a system of axioms and the 
structures which satisfy them. And that is really all there is to it. 

All of this will become clear when I show how to eliminate the incon- 
gruities and foundational perplexities arising out of the conventional view 
of classical logic. And this is the task to which I shall now address myself. 


1.7 The alternative to the conventional view of classical logic. 


What is required is an acceptable alternative to the conventional view of 
classical logic and set theory, an alternative which does not lead to the 
difficulties which I have been discussing. It is not necessary to alter the 
basic principles of set theory—certainly not necessary to replace set theory 
by some other foundational theory. What is needed is simply to get the 
connections between set theory and classical predicate logic properly sorted 
out. Once this is done, everything else will fall into place, and, in particular, 
it will become apparent that no special effort will be required in order to 
give a perfectly adequate account of the foundations of category theory. 

Let us look, then, at the connection between classical logic and set 
theory. The basic question is this: what domains of variation are to be 
allowed for quantified variables? That is: what can serve as the universe of 
discourse for a conventionally formulated classical language? 

First of all, let us form the resolution that the answer to this question is to 
be provided by the theory of sets. In other words, let us identify set theory 
with the study of all possible domains of variation for classically quantified 
variables. This resolution is not to be taken casually. It is the single most 
important step in the overthrow of the conventional view of classical logic. 
For it is by failing to adopt this attitude towards set theory that the 
adherents of the conventional view make their first and most serious 
mistake. We are standing at a crossroads. In one direction lie the two-set- 
theories doctrine, the doctrine of the proper class, all the pettifogging 
distinctions between small categories, large categories, meta-categories etc. 
This is the direction taken by those who accept the conventional view. If 
we are to avoid all of the difficulties attendant on these notions, we must 
take the other direction. And that means that we must construe set theory 
as the study of all possible classical universes of discourse. 


1 For an illustration of this, see Freyd [1964], p. 86. The whole point of his argument is 
lost, because he has, in effect, incorporated the notion of proper class into his definition 
of complete category (cf. p. 26, line 5, where he talks of products over arbitrary index sets). 

Let us begin with the logic. Classical logic is extensional logic. It deals 
with truth values rather than propositions, and with collections taken in 
extension, rather than with properties. Now the essential point about a 
collection taken in extension is that it is completely determined by what 
its members are. It is entirely independent of the way in which its elements 
are specified as being its elements. There is nothing to the collection whose 
elements are a, b, c, ... other than that its elements are a, b, c, .... Extensional 
collections are thus completely simple and transparent. They 
have no internal structure. They are indeed so simple that there is no 
simpler concept in terms of which they can be defined. But since the 
domain of variation of a classical bound variable is always such a collection, 
and since, conversely, any such collection can serve as such a domain of 
variation, the classical laws of logic can perhaps provide a kind of operational 
or pragmatic definition of a collection taken in extension: it is 
whatever these laws hold good for. 

Thus, in virtue of the resolution we have just formed, set theory is the 
study of collections taken in extension, and we identify the concept of set 
with that of extensional collection. Notice that all I am doing here is 
specifying the meaning which I intend to give to the term set. Whether 
sets in this sense can be identified with sets in the sense of Zermelo- 
Fraenkel remains to be seen. (I maintain that they can.) But in any event, 
it is clear that by definition no classical universe of discourse can fall outside 
the scope of set theory, understood in the sense I have just given. 

If we now assume that sets form a completely homogeneous system, so 
that all the usual operations for forming new sets from previously given 
ones can be uniformly applied, it is easy to see that there can be no 
universal set, and consequently no all-inclusive sets like the set of all 
groups, either. All that is really required to establish this is Zermelo’s 
principle of Aussonderung by means of which it is always possible, given a 
set a and a decidable property φ—definit Eigenschaft in Zermelo’s terminology—to 
form the set, {x ∈ a | φ(x)}, of all those elements x of a which 
satisfy the given property φ. This allows us to use the well known Russell 
argument to produce a counter-example to the supposed universality of 
any given set. Of course this assumption that sets form a homogeneous 
system, an egalitarian regime in which all sets are treated equally, being 
subject to the same basic laws and operations—this assumption is by no 
means the only one possible. Nor is it necessarily the most plausible 
assumption that could be made. It is, however, the simplest. And what is 
more important, it is the assumption which leads to the classical system of 
set theory, the system employed instinctively by the overwhelming majority 
of present day mathematicians. 
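
The Russell argument invoked above can be spelled out in a few lines. What follows is a sketch in modern notation rather than the author's own formulation; the letter $r$ for the separated subset is introduced here purely for illustration, and the property "$x \notin x$" is assumed to count as a definite property in Zermelo's sense.

```latex
% Given any set $a$, Aussonderung yields the subset
\[
  r \;=\; \{\, x \in a \mid x \notin x \,\}.
\]
% Suppose, for contradiction, that $r \in a$. Then, by the definition of $r$,
\[
  r \in r
  \;\Longleftrightarrow\; \bigl( r \in a \ \text{and}\ r \notin r \bigr)
  \;\Longleftrightarrow\; r \notin r ,
\]
% which is impossible. Hence
\[
  r \notin a ,
\]
% so $a$ fails to contain at least one set, and no set is universal.
```

Since $a$ was arbitrary, the same construction supplies a counter-example to the supposed universality of any given set, which is all the text requires.
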

But what are the consequences of these considerations for formal logic? 
And how do we incorporate them into a formal system of set theory? 
Strictly speaking, no change need be made in the conventional formulation 
of classical predicate logic at all. Of course from the new point of view it 
would not be possible to apply conventional predicate logic to set theory, 
so the problem of formalising set theory would still remain. But simply to 
retain the conventional formulation of predicate logic is not the natural 
thing to do. It would violate the spirit of the new point of view, and could 
easily lead to a relapse to the conventional view of logic with all that that 
entails. The simplest and most natural thing to do is to effect a simultaneous 
reformulation of both classical predicate logic and set theory, incorporating 
them into a single general system of classical logic. Such a system would 
be a universal system, in the spirit of Frege’s Begriffsschrift or Russell’s and 
Whitehead’s Principia Mathematica, but with this important difference: 
it would represent a formalisation of mathematics as it is actually practised, 
not, as it were, a rival approach into which current practice could be some- 
how translated. 

I have worked out the details of such an approach elsewhere ([1978]), 
so I shall content myself here with a brief sketch. What is required in 
order to accommodate predicate logic to a single system is a simple, one 
must say trivial, technical stratagem. One simply requires that whenever a 
quantifier is used, the domain of variation of its variable must be explicitly 
specified using an appropriate set term. Thus instead of writing 

    (∀x) φ(x) 
and then stipulating that the variable x ranges over the set S, one writes 
    (∀x ∈ A) φ(x) 

and stipulates that A takes the value S. What could be simpler? And yet, 
once this convention is adopted there is no longer any need to regard first 
order logic as fragmented into an infinite number of distinct formal 
languages, one for group theory, one for arithmetic, one for the theory of 
vector spaces, etc. After all, the only point in having all these different 
languages was to keep the quantifier domains from clashing. First order 
logic thus becomes a single, unified system. And this is not all. With set 
terms explicitly present in the language, it becomes quite natural to subject 
them to elementary set theoretical operations. In this way the familiar 
boolean operations, the operations of comprehension and of replacement, 
and the set-theoretical predicates of membership (∈) and inclusion (⊆) are 
all incorporated into a reformulated and expanded system of first order 
logic. Furthermore—and this is the important point—all of the model 
theoretic and proof theoretic definitions, constructions, techniques, 
theorems, etc., of the classical first order predicate calculus in its conventional 
formulation, carry over, mutatis mutandis, to the new formulation. 
This new approach to logic makes the relationship between classical logic 
and set theory perfectly clear: classical predicate logic is a subsystem of set 
theory. Roughly speaking, first order logic coincides with those set 
theoretical constructions and arguments in which a rigid distinction is 
maintained between individuals and sets. A set may occur as an individual 
—anything counts as an individual—but not as both a set and an individual. 
(For a more detailed description of the unified system, see the appendix.) 
In particular, classical predicate logic is not the underlying logic of set 
theory. It lies inside set theory not ‘under’ it. Set theory does not have an 
underlying logic, because it is the underlying logic of all of classical 
mathematics. 
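
The stratagem of explicitly bounded quantifiers described above can be illustrated by a small executable sketch. It is only an illustration under strong assumptions: the domains are finite, and the helper names `forall` and `exists`, the sets `Z3` and `Z4`, and the sample formulas are all hypothetical, not taken from the unified system itself. The point it shows is that, once every quantifier carries its own set term, one language serves for several structures, and quantifiers over different domains can occur in one formula without clashing.

```python
from typing import Callable, Hashable, Iterable


def forall(domain: Iterable[Hashable], pred: Callable[[Hashable], bool]) -> bool:
    """Bounded universal quantifier: (∀x ∈ domain) pred(x)."""
    return all(pred(x) for x in domain)


def exists(domain: Iterable[Hashable], pred: Callable[[Hashable], bool]) -> bool:
    """Bounded existential quantifier: (∃x ∈ domain) pred(x)."""
    return any(pred(x) for x in domain)


# Two finite sets standing in for the values of set terms (assumed examples).
Z3 = frozenset(range(3))  # residues modulo 3
Z4 = frozenset(range(4))  # residues modulo 4


def has_additive_inverses(A: frozenset, n: int) -> bool:
    # (∀x ∈ A)(∃y ∈ A) x + y ≡ 0 (mod n)
    return forall(A, lambda x: exists(A, lambda y: (x + y) % n == 0))


def every_nonzero_is_unit(A: frozenset, n: int) -> bool:
    # (∀x ∈ A)(x = 0 or (∃y ∈ A) x·y ≡ 1 (mod n))
    return forall(A, lambda x: x == 0 or exists(A, lambda y: (x * y) % n == 1))


# One formula whose two quantifiers range over two different sets:
# (∀x ∈ Z3)(∃y ∈ Z4) x < y
mixed = forall(Z3, lambda x: exists(Z4, lambda y: x < y))
```

The same two formula schemas are evaluated at different values of the set term, so no separate "language of Z3" or "language of Z4" is needed; the domain is a parameter of the quantifier, not of the language.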


1.8 The well-foundedness of the membership relation. 


Now if our system of set theory is accurately to reflect the actual practice 
of present day mathematics and to contain those modes of reasoning and 
construction which are universally accepted in the mathematical com- 
munity, it must then incorporate the principles of set construction included 
in Zermelo-Fraenkel. In particular the operations of pair set, power set, 
sum set, and the principle of definition by transfinite induction must all 
be accepted as universally applicable. And all these constructions are 
intuitively plausible, given our analysis of set as collection taken in ex- 
tension. We have therefore come very close to the conclusion that our 
notion of set coincides with the Zermelo-Fraenkel one. However, we still 
need to consider the question of the well-foundedness of the membership 
relation. In other words we must decide whether the most general notion 
of extensional collection coincides with what Gödel calls the notion of 
‘set of’, the notion of set as an element of the cumulative hierarchy of Zermelo 
and Mirimanoff. 

This question goes straight to the heart of what is meant by a collection 
taken in extension. Is it possible to make sense of the notion of a non- 
well-founded collection of this sort? Is it, in other words, possible to have a 
collection of collections (all taken in extension) from which an extensional 
subcollection can be extracted which does not contain a minimal element 
in the sense of the membership relation? The crucial, determining fact 
here is that all collections are to be taken in extension. For there is nothing 
to the collection A taken in extension other than that it contains the 
objects a, b, c, . . . and no others as its elements, where a, b, c, . . . must, in 
Cantor’s famous phrase, be ‘definite, properly distinguished objects of our 
intuition or our thought’. But if the element a in A is itself a collection, 
then its existence as a well defined object requires that its members 
x, y, z, ..., in their turn, be ‘definite, properly distinguished objects of our 
intuition or our thought’, and so on. The fact that all collections are taken 
in extension imposes severe restrictions on the methods of argument that 
can be employed to establish the existence of a collection corresponding to 
a given specification. Such an argument must show that Cantor’s criterion 
is satisfied. And no appeal can be made to the way in which the supposed 
collection is specified. This means that such an argument must always—at 
least in principle—be reducible to, or entail the existence of, an argument 
of the following form: 





    D(u) D(v) ...    D(w) D(x) ...    D(y) D(z) ...
    -------------    -------------    -------------
        D(a)             D(b)             D(c)       ...
    -----------------------------------------------------
                            D(A)

where D(s) stands for ‘s is well defined’ and the only rule of inference 
permits the conclusion D(s) from the premisses D(r), D(t), ..., where 
r, t, ... are the members of s. 


A non-well-founded collection would thus require a non-well-founded 
argument to establish its authenticity in terms of Cantor’s criterion. But it 
is obvious that no argument whose premisses proceed in a circle or regress 
to infinity can be valid. Hence the notion of a non-well-founded extensional 
collection is a vacuous one. 

I believe that these considerations throw some light on the real meaning 
of Russell’s vicious circle principle. Let us say that an extensional collection 
has only one essential or canonical definition, namely by the specification of 
its elements, and the specification, in turn, of their elements, and so on. 
Then the well-foundedness of extensional collections would follow im- 
mediately from the following extensional version of the vicious circle 
principle: there can be no infinite regress nor vicious circle in the canonical 
definition of an extensional collection. Russell’s own formulation of the 
principle is for non-extensional properties, and therefore seems less 
plausible—especially since it is at odds with the way mathematicians 
actually reason. In the extensional form, however, as a principle relating 
to collections taken in extension, it seems unavoidable. 

Of course none of these arguments for well-foundedness is rigorously 


1 This issue is discussed at length by Gödel in his (roa where he makes (p. 222, final 
paragraph) essentially the same point I am making here. 
compelling. It is in the very nature of the case that this must be so. We are, 
after all, discussing what ought to count as a compelling argument in 
mathematics. What these arguments really amount to is a challenge to 
anyone who does not accept their conclusion: try to form a clear picture 
of what a non-well-founded collection might be. I believe anyone who 
tries this will see why extensionality forces us to accept the well-foundedness 
of the membership relation. 


1.9 Conclusion. 


Let me now briefly recapitulate the discussion. I began by describing the 
conventional view of classical logic, the view of logic which naturally arises 
out of the standard formulation of the predicate calculus. I then showed 
how the formalisation of set theory in conformity with the conventional 
view leads to serious difficulties, in particular, to the two set theories 
doctrine, according to which set theory splits into an official formalised 
theory and an unofficial informal one, and to the familiar doctrine of the 
proper class. This led naturally to an examination of the consequences of 
these set-theoretical difficulties for the ordinary practice of mathematics. 
I traced the effects of the conventional view in the foundations of category 
theory, showing how all the confusing and unnecessary distinctions between 
the different sorts of categories arise directly out of the corresponding 
equally confusing and equally unnecessary distinctions in set theory. I 
observed how the clamour among category theorists for the replacement of 
set theory as a foundational theory, insofar as it did not arise out of 
legitimate dissatisfaction with the conventional formalisation of set theory, 
was based on a complete misunderstanding of the nature of set theory’s 
role in the foundations of mathematics. Finally, I showed how, starting 
from the determination to develop set theory in such generality as to 
encompass all possible classical domains of discourse, we are led to a 
simultaneous reformulation of set theory and predicate logic, incorporating 
both into a single, unified, universal system of classical logic. And, by 
assuming that sets are collections taken in extension, and that the basic 
operations of set theory are universally applicable, I was able to show that 
the intuitive notion of set which underlies this universal system is the 
familiar Zermelo-Fraenkel notion of set as a member of the cumulative 
hierarchy. 

All of these considerations have thus led me to recommend a kind of 
free variable version of Zermelo-Fraenkel set theory. It stands to traditional 
Zermelo-Fraenkel set theory, formalised as a conventional first-order 
theory, in much the same relation that Skolem’s free variable 
arithmetic stands to classical first-order Peano arithmetic. This analogy is 
a significant one, and one which I shall attempt to exploit in due course. 
But notice: I have been led to adopt this free variable system of Zermelo- 
Fraenkel on purely logical grounds. In particular, I am not trying to play 
it safe by not allowing classical quantifiers over the supposed universe of 
sets. Nor am I trying in any way to diminish the compass of set theory. 
On the contrary, I am trying to increase it. As I showed earlier, no truly 
universal theory of sets, no such theory which includes every conceivable 
classical universe of discourse as an object inside the theory, can possibly 
be formulated as a conventional first order classical theory. For the 
semantics of first order theories would then require the existence of a 
universe of discourse for that theory. And such a universe would have to 
be outside the purview of the theory itself. 

Rather than say ‘You can’t quantify classically over the universe of 
sets.’, I should prefer to say ‘Whenever you can quantify classically over a 
domain, that domain is just a set, with a power set, which, in its turn, has a 
power set, and so on. Therefore there is no universe of sets, in the sense of 
a domain over which classical quantification is permitted, and which 
contains every set as an element.’. There is thus no reason why this new 
point of view should be regarded, a priori, as a weakening of set theory. 
On the contrary, whenever we are confronted with a system of set theory 
formulated in a conventional classical predicate calculus of first or higher 
order, then we should take its intended interpretation to be a set, and 
attempt to construe all the arguments, if any, in its favour, as arguments 
for the existence of such a set. For example, the case for Morse’s system 
should be regarded as a case for the existence of a strongly inaccessible 
cardinal number, and so on. 

Set theory should thus be seen as the most general, or the absolute, 
system of higher order logic; or perhaps, since it is necessarily incomplete, 
as the ultimate framework for higher order logic. It provides the language 
for that logic. This means that increases in order are to be achieved, not by 
sticking new types on top of the theory, as it were, but by sticking them 
inside it, using the appropriate axioms of infinity. 

I have been led to this reformulation of set theory and classical predicate 
logic by practical, common-sense considerations of the way those theories 
are actually used. The path I have followed seems to me an inevitable one, 
given the basic restraints I have imposed upon myself. My guiding 
principles have been to avoid highly speculative or controversial
considerations, and to conform to the fundamental consensus on practical
foundational matters manifested in the practice of contemporary
mathematicians, if not always in their expressed opinions.

1 THE SPECIAL THEORY OF RELATIVITY: BASIC IDEAS

Einstein’s paper ‘Zur Elektrodynamik bewegter Körper’ which appeared 
in 1905 in the Annalen der Physik contained the basic ideas of what is now 
called the special theory of relativity, no doubt one of the boldest and most 
interesting theories ever designed.1 Generations of physicists, mathemati- 
cians and philosophers were inspired by it. Bitter battles were fought over 
its principles and deductions which clashed with many an honourable idea 
backed up by honourable philosophical and scientific traditions. Yet all 
the knocks and bites from left and right seem to have left Einstein’s theory 
unshaken. Today the battle-grounds of yesterday look rather deserted. The 
critic is likely to be silenced in the face of the theory’s spectacular success. 

In the following I shall try to throw some new light on one or two old
problems which Einstein’s theory gives rise to. Section 2 is concerned with 
the obstacles we are faced with when we try to combine Einstein’s special 
theory of relativity with a causal account of relativistic phenomena. The 
argument does not show, of course, that Einstein’s theory is false, but it 
points to a possible incompleteness and to certain limitations inherent in 
the special theory of relativity. In section 3 a substratum theory of relativity 
is outlined which seems to be as simple and coherent as Einstein’s special 
theory of relativity; the Lorentz transformations are deduced from two 
basic principles without making use of the principle of relativity. Finally, 
in section 4, the two theories are compared and some general conclusions 
are drawn. 

But let us first have a short look at the ideas underlying Einstein’s 
theory. To begin with, the special theory of relativity is a transformation 
theory. This is its most fundamental aspect. Suppose we know the space 
and time co-ordinates of an event particle e (event with negligible spatio- 
temporal extension) in an inertial frame of reference F; then the theory 
1 Einstein [1905], p. 89x. 


Received 1 May 1975 


36 A. Grieder 


enables us to calculate the space and time co-ordinates of e in any other 
inertial frame F’. A frame of reference is some kind of platform which
contains three material lines issuing from a point and perpendicular to one
another, along with a mobile measuring-rod and a number of clocks. By
an inertial frame is meant a frame in which the law of inertia is valid, or,
which comes to the same thing, in which the laws of mechanics assume
their simplest form.1 In order to arrive at the transformation in question
Einstein considers an inertial frame F and another frame F’ such that the
following is the case: the X-axis of F coincides with the X’-axis of F’; the
origin O’ of F’ moves along the X-axis towards increasing values of x, its
speed being constant and equal to v; the Y- and Z-axes of F are all the time
parallel to the Y’- and Z’-axes of F’ respectively, and coincide with them when
local clocks in O and O’ show the time t = t’ = 0; it is also assumed that
the clocks and rods used in the two frames are in all respects alike. Apart
from these kinematic stipulations Einstein introduces two simple and
general principles, the light principle and the principle of special relativity.
The light principle says that in a vacuum light is propagated with constant
velocity, at least with respect to a definite inertial frame F.2 Thus, in F the
speed of light in vacuo is found to be independent of the direction of
propagation, independent of the motion of the emitting source, and
independent of the wavelength. The principle of special relativity asserts
that all laws of nature are the same in all frames of reference moving
uniformly relative to each other.3 Whatever the meaning of ‘law of nature’,
Einstein certainly considers the light principle to be a law of nature. For
he immediately draws the conclusion that light in vacuo is propagated
with one and the same velocity in all frames of reference moving uniformly
and without rotation relative to an inertial frame. This statement
constitutes the hard core of the special theory of relativity. However, two
additional considerations are needed to derive the transformations we are
looking for. First, as Einstein points out, the transformations must be
linear on account of the homogeneity of space and time. Secondly, Einstein
refers to what I shall call the symmetry convention: the length, as measured
in F, of a unit rod belonging to F’, at rest in F’ and parallel to the X’-axis,
equals the length, as measured in F’, of a unit rod belonging to F, at rest
in F and parallel to the X-axis. (A similar symmetry convention could be
introduced for clocks). Now, from these assumptions we obtain the
well-known Lorentz transformations (or, to be more precise, a subgroup of the
general Lorentz transformations):

$$
x' = \frac{x - vt}{\sqrt{1-(v/c)^2}}, \qquad y' = y, \qquad z' = z, \qquad
t' = \frac{t - (v/c^2)\,x}{\sqrt{1-(v/c)^2}}.
$$

1 Einstein [1967], p. 23. Einstein [1968], p. 12.
2 Einstein [1905], p. 41. Einstein [1967], p. 26. Einstein [1968], p. 17.
3 Einstein [1905], p. 41. Einstein [1967], pp. 23-4. Einstein [1968], p. 13.
4 Einstein [1905], p. 45. Einstein [1967], p. 28. Einstein [1968], pp. 19-20, 31-3.
5 Einstein [1968], pp. 117-18. In Einstein [1905], pp. 41-2, 47-8 and Einstein [1967],
pp. 33-4, a slightly different procedure is applied.
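The statement that light in vacuo has one and the same velocity in F and in F’ can be checked numerically against these transformations. What follows is only an illustrative sketch: the function name `lorentz_einstein` and the sample values are introduced here and are not Einstein’s, and units are assumed in which c = 1.

```python
import math

def lorentz_einstein(x, t, v, c=1.0):
    """Co-ordinates (x', t') in F' of an event with co-ordinates (x, t) in F.

    F' is assumed to move with speed v along the X-axis of F (|v| < c).
    """
    gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return gamma * (x - v * t), gamma * (t - v * x / c ** 2)

# A light signal emitted at the common origin at t = 0 reaches x = c*t in F.
c, v, t = 1.0, 0.6, 2.0
x_prime, t_prime = lorentz_einstein(c * t, t, v, c)

# Speed of the same signal as judged from F'; it should again equal c.
print(x_prime / t_prime)
```

The ratio printed at the end is the speed of the signal in F’; for any admissible v it should agree with c up to rounding.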

In view of the following analysis I would like to underline two aspects 
of Einstein’s procedure. First, as Einstein’s derivation of the Lorentz 
transformations depends upon the symmetry convention the same must
apply to his account of relativistic laws of nature such as, e.g. the
mass-energy relation, Fresnel’s dragging coefficient, the Doppler effect. But
seeing that these laws can, in principle, be discovered without referring to
more than one inertial frame, observers in an inertial frame F are justified 
in demanding an explanation of the laws which is independent of the way 
spatial and temporal periods are measured in other inertial frames. Secondly, 
it is often overlooked that Einstein’s principle of special relativity is only 
one of two principles of relativity to be found in classical mechanics. The 
principle Einstein makes use of concerns the comparison of laws of nature 
in a frame F with laws of nature in other frames which are in uniform 
rotation-free motion relative to F. Now, the second principle of relativity 
has to do with the behaviour of similar physical systems S, S’, S”,... all 
of which are observed from one and the same inertial frame F. S is 
supposed to be at rest in F while S’, S”, ... are in uniform rotation-free
motion relative to F. The second principle of relativity then states that 
whatever the inertial frame F and whatever the nature of the similar 
systems S, S’, S”,...the laws of mechanics apply to all of them in the 
same way. It is true that we have to demand that the influence of bodies 
which do not belong to the systems must be negligible; however, the 
validity of Einstein’s principle depends upon similar assumptions (in 
particular upon the assumption that the influence exerted by the particular 
structure of the frame of reference can be neglected). Suppose, e.g. that 
two railway carriages are each equipped with a gun which is rigidly 
connected with the carriage and points in the direction of the X-axis of F. 
Let us assume that the two systems are in all respects alike, and that the 
first is at rest in F while the second is moving uniformly and without 
rotation along the X-axis of F; if the guns are fired, then according to the 
second principle of relativity observers in F should find that the relative 
speeds of bullet and gun are identical in the two cases. It is evident, 
however, that the second principle of relativity is violated in case we are
concerned with mechanical phenomena which involve speeds comparable
to the speed of light. It is also evident that the principle cannot be extended
to electrodynamics: it would contradict the light principle. Ritz, in his
emission theory of light, assumed that light emission takes place in
accordance with the second principle of relativity; but his hypothesis did
not agree with observation and experiment. It is in this fact, that the
second principle of relativity cannot be incorporated into the special
theory of relativity, that the gulf manifests itself which separates Einstein’s
theory from classical mechanics.


2 RELATIVISTIC EFFECTS AND THEIR CAUSES 


The argument elaborated in this chapter is a philosophical argument, 
essentially similar to Einstein’s famous arguments against the Copenhagen 
interpretation of quantum mechanics.1 If it is correct, it shows that the
special theory of relativity provides an unsuitable framework for the causal 
analysis and explanation of relativistic phenomena. (When reading the 
following paragraphs some readers might be reminded of the clock 
paradox and related problems. So I should perhaps state that I am not 
concerned here with what is normally called the clock paradox and that 
in my view this paradox does not amount to an inconsistency in Einstein’s 
theory.) 

Clock retardation is a well-known and fundamental relativistic effect. 
Let A and B be clocks which are in all respects alike and of the kind 
admitted in the special theory of relativity; and let A be at rest in F while 
B is moving uniformly and without rotation along, say, the X-axis of F. 
According to Einstein’s theory observers in F will find that the rate of B 
is slowed down if compared with the rate of A; and they are certainly 
justified in searching for a material cause of this remarkable discrepancy. 
What possible alternatives does the special theory of relativity suggest or 
allow for a causal explanation of the relativistic retardation effect? I shall 
consider four answers to this question and show that none of them is
satisfactory. 


1 Einstein [1948], p. 320.

(1) The discrepancy is due to the motion of the clocks relative to certain bodies,
e.g. relative to the clocks themselves or the frame of reference. However,
it is overlooked here that according to the special theory of relativity clock
retardation is a vacuum effect which is assumed to occur even if only clock
B is present. Moreover, whatever the size, mass, chemical properties and
distance of the clocks used, the retardation effect is unaffected by such
factors. Von Laue thought that clock retardation (and rod contraction)
must be explained by taking into account the causal influence exerted by
the inertial frame itself.1 Again, the influence of the bodies making up the
inertial frame would have to be the same everywhere and could not depend
upon the individual nature of the ‘platform’ we happen to use as an
inertial frame. This is certainly a strange kind of influence and the special
theory of relativity has done nothing to make this hypothesis a plausible one.


(2) The discrepancy is due to a medium and its interaction with the clocks 
A and B. The interaction is supposed to depend upon the relative speed of 
clock and medium. Hence, the medium must be given a state of motion 
with respect to the inertial frame F. Now, the idea of such a medium was 
rejected by Einstein.2 Indeed, it is difficult to see how it could be made
compatible with the principle of special relativity. If the medium M is 
given a state of motion, then there exists a preferred frame of reference F, 
namely the one which is at rest in M. Observers in F would be justified in 
considering the following statement to be a law of nature: ‘the rate of a 
clock which is moving uniformly through the medium is slowed down as 
compared to the rate of a clock which is stationary in the medium’. 
However, the statement does not hold in all other inertial frames. Thus F 
cannot be equivalent to all other inertial frames, and the principle of 
special relativity is violated. 


1 Von Laue [1921], vol. 1, p. 62.
2 Einstein [1905], pp. 37-8. Einstein [1968], p. 53.

(3) It is the difference in speed relative to space which gives rise to the
discrepancy. A rather desperate idea! For space itself is not something
material and its interaction with material things would be a rather
mysterious affair. It seems strange that adherents of the special theory of
relativity should dismiss the idea of a basic substratum on the grounds
that there is no experimental proof for its existence and at the same time
introduce the idea of an interaction between material things and a
non-material entity, an interaction which seems to be beyond the realm of
observation altogether. Furthermore, what is meant by ‘speed relative to
space’? Either there is only one space, or there are many spaces which
influence the behaviour of the clocks. In the first case, there exists once
more a preferred frame: the frame which is at rest in space, and the
principle of relativity becomes questionable. In the second case other
difficulties arise. If there is to every space relative to which clock A moves
with speed v another space relative to which B moves with speed v, and
vice versa, then the two clocks would be affected in an entirely symmetric
way. In order to obtain an asymmetric effect, in any given inertial frame,
we may demand that each space be associated with such a frame. But what
reason is there for this restriction? It should also be remembered that the
special theory of relativity is part of the general theory of relativity. In
the latter Einstein was guided by Mach’s idea of eliminating space as an
active cause in the system of mechanics.1 To accept this idea for the general
theory of relativity while rejecting it for the special theory must be
considered highly unsatisfactory.

(4) Finally, there is the view that the discrepancy in question is not a physical 
reality at all, but merely a consequence of the way clocks are compared, in 
particular of the Einsteinian definition of simultaneity. Similarly, the change 
of length which rods undergo as a result of their different states of uniform 
translatory motion is said to be merely a consequence of procedures, as
prescribed by Einstein’s theory, for comparing lengths, in particular of the 
relativistic definition of simultaneity for spatially distant events. These 
views, defended, e.g. by Born, appear at first sight to offer an escape from 
the trilemma stated above.2 A closer examination shows, however, that they
cannot be upheld. It is, in principle, possible to establish the existence of 
the discrepancies mentioned above, without making use of Einstein’s 
simultaneity definition for spatially distant events and by utilising only 
procedures, such as, e.g. coincidence judgments, which are basic to both 
relativistic and classical physics. Let us consider three solid rods AA’, 
BB’, CC’ which are in all respects alike. For simplicity we suppose that 
AA’ is at rest in the inertial frame F and that its ends A and A’ coincide 
with the X-axis of F; BB’ and CC’ we suppose to move uniformly in the 
same direction (say towards increasing values of X) and in such a way that 
their ends B and B’, C and C’ coincide with the X-axis during the time
the experiment is taking place. We now assume that at one, and only one,
moment of time during the experiment the left ends A, B, and C coincide
with each other. If it is observed that the right ends A’, B’, and C’ do not
simultaneously coincide at any moment during the experiment, then it can 
be concluded that at least one of the rods BB’ and CC’ must have under- 
gone a change in length. A similar argument can be constructed by using 
three clocks A, B, and C which are in all respects alike. Let A be at rest 
at the origin O of the inertial frame F while B and C are moving uniformly 
and in opposite directions along the X-axis of F. We assume that B 
coincides with A and, at a later time, with C. On coinciding A and B are
set so as to indicate the same time. When at some later time C coincides
with B it is set so as to show the same time as B shows at the moment of
coincidence. Finally, when A and C coincide at O their readings $t_A$ and $t_C$
are compared. The arrangement is well-known from discussions of the
clock paradox (or twin paradox). It is thus, in principle, possible to
establish that $t_A > t_C$ without referring to Einstein’s definition of
simultaneity for spatially distant events. That not all the three clocks go at the
same rate must therefore be considered a fact which is prior to and
independent of the relativistic procedures for comparing
the rate of clocks.

1 Einstein [1967], pp. 54-5.
2 Born [1969], pp. 218-20.
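The three-clock arrangement can be put into numbers. The sketch below is an illustration only. It assumes, beyond what is stated above, that B and C have the same speed v relative to F, and it takes the retardation factor √(1 − (v/c)²) of the special theory for granted; the name `clock_readings` and the parameter `T` (the time, as measured in F, which B needs to travel from A to its meeting with C) are introduced here.

```python
import math

def clock_readings(v, T, c=1.0):
    """Readings (t_A, t_C) at the moment A and C coincide at O.

    Assumption made for this sketch: B recedes from A with speed v for a
    time T (measured in F), and C returns to A with the same speed v, so
    that C also needs the time T.
    """
    rate = math.sqrt(1.0 - (v / c) ** 2)   # rate of a moving clock relative to A
    t_A = 2.0 * T                          # A is at rest in F throughout
    t_B_at_handover = rate * T             # B was set equal to A on coinciding
    t_C = t_B_at_handover + rate * T       # C takes over B's reading, then runs on
    return t_A, t_C

t_A, t_C = clock_readings(v=0.6, T=5.0)
print(t_A, t_C, t_A > t_C)
```

Only coincidence judgments enter: A with B, B with C, and C with A. No comparison of spatially distant clocks is needed to obtain the two readings.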

We are thus led to the conclusion that the special theory of relativity bars
the way to a causal account of relativistic phenomena such as rod
contraction and clock retardation. In particular, it rules out the most attractive
alternative: a causal explanation in terms of a basic substratum (‘ether’)
whose interaction with bodies contained in it depends upon their speed
relative to the substratum.


3 THE PRINCIPLE OF INDEPENDENCE AND THE LORENTZ
TRANSFORMATIONS 


In this section it will be shown how the Lorentz transformations can be 
arrived at without the help of the principle of special relativity. We are 
concerned with the behaviour, as judged from a definite inertial frame F, 
of similar physical systems. Two physical systems we call similar if, and 
only if, they are in all respects alike when compared in a state of rest 
relative to F. Let S and S’ be two similar physical systems, and let S be 
stationary in F while S’ moves rotation-free with a constant speed v along 
the X-axis of F towards increasing values of x. Let us further assume that 
observers in F observe corresponding event particles in the two systems, 
say e in S and e’ in S’. We wish to find a linear transformation T which
expresses the space and time co-ordinates x’, y’, z’, t’ of e’ as a function of
the space and time co-ordinates x, y, z, t of e.1 In order to simplify things
a little we assume that if the event particle $e_0$ of S has the co-ordinates
x = 0, y = 0, z = 0, t = 0, and if the event particle corresponding to $e_0$
in S’ is $e_0'$, then the co-ordinates of $e_0'$ are x’ = 0, y’ = 0, z’ = 0, t’ = 0.
It should be emphasised again that both sets of co-ordinates are measured 
in one and the same inertial frame F. 


1 Compare Einstein [1967], pp. 30-4, Einstein [1905], pp. 44-8. Einstein [1968], pp. 
115-2q. A transformation T is linear if, and only if for any vectors a, b, c and any real 
number r we have T(a+b) = T(a)+T(b) and T(rc) = rT(c). Thus, if two equal rods 
are put together such as to form a rod of double the length, the effect T has upon its 
length is twice the effect it has on each of the two constituents. If this were not the case, 
rods which are in all respects alike could be affected by T in different ways, depending 
upon their place in space. Analogous considerations apply to successive periods of a 
clock. The linearity of T expresses the (local) spatio-temporal homogeneity of the basic
substratum.

We postulate the following:
(i) there exists an inertial frame F such that the velocity of light 
(in vacuo) is c = const. (light principle); 
(ii) the transformation T is independent of the nature of the physical 
systems S and S’ (principle of independence). 


By means of these two principles we can demonstrate that T must be a 
Lorentz transformation. To begin with, the principle of independence 
assures us that in order to derive T we can choose whatever kind of 
physical system we wish provided only that there exist systems similar to 
it to which various uniform velocities can be imparted. Let us consider, 
then, a system S which consists of two rods AB and AC which form a solid
angle, and of a small light source L which is attached to the solid angle at A
(see the figure below). Correspondingly, S’ consists of two rods A’B’ and
A’C’ forming a solid angle of a size equal to that of S, and of a small light
source L’ similar to L at A’; in order to compare the size of the two solid
angles we put them side by side when they are both stationary in F.

[Figure]

Let L
and L’ coincide at the origin O of F at the time t = 0, and let each source
emit a light signal at the instant of coincidence. We wish to determine the 
space and time co-ordinates of four event particles $e_0$, $e_0'$, $e_1$ and $e_1'$. $e_0$ is the
emission of the light signal by L, $e_0'$ the emission of the light signal by L’.
We obtain for the co-ordinates

$$ e_0: (0, 0, 0, 0); \qquad e_0': (0, 0, 0, 0). $$

Event particle $e_1$ is the arrival of the light signal from L at B, event particle
$e_1'$ the arrival of the light signal from L’ at B’. Since AB is stationary in F
while A’B’ is moving uniformly towards increasing values of x, the
co-ordinates, as measured in F, are

$$ e_1: (x, 0, 0, x/c); \qquad e_1': (x', 0, 0, t'), $$

where

$$ x' = f(x, v) + vt' \tag{1} $$

and

$$ t' = x'/c = \frac{f(x, v)}{c - v}. \tag{2} $$


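From the linearity of T (with respect to spatial and temporal co-ordinates),
the initial condition mentioned above and from some basic symmetry
considerations it follows that1

$$
\begin{aligned}
x' &= a_{11}x + a_{14}t,\\
y' &= a_{22}y,\\
z' &= a_{33}z,\\
t' &= a_{41}x + a_{44}t.
\end{aligned} \tag{3}
$$

(1), (2) and (3) yield

$$ f(x, v) = g(v)\,x, \tag{4} $$

and hence

$$ x' = \frac{g(v)}{1-\beta^2}\,(x + vt), \qquad \beta = v/c, \tag{5} $$

$$ t' = \frac{g(v)}{1-\beta^2}\left(t + \frac{v}{c^2}\,x\right), \tag{6} $$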


with the as yet undetermined function g(v). Next we consider a further
pair of corresponding event particles, $e_2$ and $e_2'$. $e_2$ is the arrival of the light
signal emitted by L at C, $e_2'$ the arrival of the light signal emitted by L’
at C’:

$$ e_2: (0, y, 0, y/c); \qquad e_2': (vt', y', 0, t'). $$

We have

$$ (ct')^2 = (y')^2 + (vt')^2. \tag{7} $$

From (3) we obtain

$$ y' = k(v)\,y, \tag{8} $$

with an as yet unspecified function k(v). (7) and (8) yield

$$ t' = \frac{y'}{c\sqrt{1-\beta^2}} = \frac{k(v)\,y}{c\sqrt{1-\beta^2}}. \tag{9} $$

But from (6) we obtain for x = 0 and t = $t_2$ = y/c

$$ t' = \frac{g(v)}{1-\beta^2}\,\frac{y}{c}. \tag{10} $$

Comparing (9) and (10) we find

$$ k(v) = \frac{g(v)}{\sqrt{1-\beta^2}}, \tag{11} $$

and

$$ y' = \frac{g(v)}{\sqrt{1-\beta^2}}\,y. \tag{12} $$
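That the formulas obtained so far hang together, whatever value g(v) may take, can be checked numerically. The sketch below is merely illustrative: the name `transform`, the sample co-ordinates and the arbitrary value chosen for g are introduced here, and units with c = 1 are assumed. It evaluates the transformation of x and t obtained from the light signal along AB, together with the transformation of y obtained from the signal along AC, and tests them against the two light-signal conditions.

```python
import math

def transform(x, y, t, v, g, c=1.0):
    """x', y', t' as functions of x, y, t, with g(v) left as a free number g."""
    beta2 = (v / c) ** 2
    k = g / (1.0 - beta2)
    x_p = k * (x + v * t)
    y_p = g / math.sqrt(1.0 - beta2) * y
    t_p = k * (t + v * x / c ** 2)
    return x_p, y_p, t_p

c, v, g = 1.0, 0.6, 0.9   # g is arbitrary here; it is determined only later

# Light reaches B at (x, 0) at t = x/c; the corresponding event must obey x' = c t'.
x1p, _, t1p = transform(2.0, 0.0, 2.0 / c, v, g, c)
print(x1p - c * t1p)

# Light reaches C at (0, y) at t = y/c; the corresponding event must obey
# (c t')^2 = (y')^2 + (v t')^2, with C' at x' = v t'.
x2p, y2p, t2p = transform(0.0, 3.0, 3.0 / c, v, g, c)
print((c * t2p) ** 2 - (y2p ** 2 + (v * t2p) ** 2), x2p - v * t2p)
```

All three printed differences should vanish up to rounding for any g, which is why a further consideration is needed to fix g(v).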


1 See e.g. von Laue, op. cit., vol. I, p. 54.

Finally, let us determine the function g(v). For this purpose we consider a
system S consisting of a machine and of a rod which is parallel to the
Y-axis and to which the machine imparts a speed v in the direction of the
negative X-axis; the rod is assumed to remain parallel to the Y-axis.
According to (12) the length of the moving rod is given by

$$ \bar{y} = \frac{g(-v)}{\sqrt{1-\beta^2}}\,y, \tag{13} $$

where y is its proper length. Now let us consider a second system S’,
similar to S, which is moving uniformly towards increasing values of x.
Equations (5) and (6) show that the speed u’ of the rod in S’ is connected
with the speed u of the rod in S according to

$$ u' = x'/t' = \frac{u + v}{1 + (uv)/c^2}. \tag{14} $$

Hence, since u = -v we have u’ = 0; that is, the rod in S’ is stationary
in F and its length is y. On the other hand, the length of the rod must be
given by

$$ y = \frac{g(v)}{\sqrt{1-\beta^2}}\,\bar{y} = \frac{g(v)\,g(-v)}{1-\beta^2}\,y. \tag{15} $$

Therefore we obtain

$$ g(v)\,g(-v) = 1 - \beta^2. \tag{16} $$

But the (local) isotropy of space demands that

$$ g(v) = g(-v), \tag{17} $$

which leads to

$$ g(v) = \sqrt{1-\beta^2}. \tag{18} $$

From (5), (6), (12) and (18) we conclude that

$$ x' = \frac{x + vt}{\sqrt{1-\beta^2}}, \tag{19} $$

$$ t' = \frac{t + (v/c^2)\,x}{\sqrt{1-\beta^2}}, \tag{20} $$

$$ y' = y, \tag{21} $$

and, since the directions of the Y- and Z-axes are equivalent,

$$ z' = z. \tag{22} $$


That is, the transformation T which connects the co-ordinates of an event 
particle in S with the co-ordinates of the corresponding event particle in 
S’ is a Lorentz transformation. 
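The last step of the argument, the determination of g(v), can be illustrated numerically. In the sketch below the names `g` and `T`, and the sample values, are chosen for illustration only; units with c = 1 are assumed. It checks that g(v) = √(1 − β²) is even in v, that the product g(v)g(−v) equals 1 − β², and that the resulting transformation leaves lengths parallel to the Y- and Z-axes unchanged while the point A’ moves with speed v in F.

```python
import math

def g(v, c=1.0):
    """The function g(v) as finally determined: sqrt(1 - beta^2)."""
    return math.sqrt(1.0 - (v / c) ** 2)

def T(x, y, z, t, v, c=1.0):
    """Resulting transformation; all co-ordinates are measured in F."""
    s = math.sqrt(1.0 - (v / c) ** 2)
    return (x + v * t) / s, y, z, (t + v * x / c ** 2) / s

c, v = 1.0, 0.6
beta2 = (v / c) ** 2

print(g(v, c) * g(-v, c), 1.0 - beta2)    # product condition, both sides
print(g(v, c) == g(-v, c))                # isotropy condition
print(g(v, c) / math.sqrt(1.0 - beta2))   # factor multiplying y

# An event at A in S (x = 0) corresponds to an event at A' with x'/t' = v.
x_p, y_p, z_p, t_p = T(0.0, 2.0, 3.0, 1.0, v, c)
print(x_p / t_p, y_p, z_p)
```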

The above derivation of the Lorentz transformations, though somewhat
analogous to that given by Einstein, differs from it in two respects. First,
we have made no use of the principle of special relativity but have used the
principle of independence instead. Our procedure shows just how
dispensable the principle of special relativity is for the derivation of the
Lorentz transformations. Secondly, the transformations as interpreted
above connect space and time co-ordinates all of which are measured in
one and the same fundamental frame F. The difference from Einstein’s
interpretation is perhaps best explained by considering the case of the
addition theorem for the x-component of velocities:

$$ u' = \frac{u + v}{1 + (uv/c^2)}, $$

which Einstein interprets as follows. If a particle is moving with speed u 
in the direction of the X-axis of F (u being the speed measured in F), and 
if F’ is moving with speed +v relative to, and as judged from F in a 
direction parallel to the X-axis, then u’ is the speed of the particle as 
measured from F’. According to our interpretation, however, both u and
u’ are measured in F; given a process which propagates in S with the speed u
parallel to the X-axis, then the corresponding process in the similar
system S’ propagates with the speed u’ parallel to the X-axis (S’ is supposed
to move uniformly with speed +v parallel to the X-axis of F).
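A few values of the addition theorem may help to fix ideas. The function name `add_velocity` and the sample speeds below are chosen purely for illustration, with units in which c = 1. On the interpretation adopted here, u is the speed of a process in S and the returned value is the speed of the corresponding process in S’, both measured in F.

```python
def add_velocity(u, v, c=1.0):
    """Speed u' of the corresponding process in S', given u in S and the speed v of S'."""
    return (u + v) / (1.0 + u * v / c ** 2)

print(add_velocity(-0.6, 0.6))   # a rod thrown backwards with speed v
print(add_velocity(1.0, 0.6))    # a light signal
print(add_velocity(0.5, 0.5))    # two sub-luminal speeds
```

The first case is the one used above to determine g(v): a rod given the speed −v in a system moving with speed +v should be stationary in F. The second shows that a process propagating with speed c in S should propagate with speed c in S’ as well.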

Now, the experiments and observations which are generally considered 
to support the special theory of relativity consist of two main groups, A
and B. It is a common feature of the experiments of group A that similar 
systems S, S’, . . . are observed and compared in one and the same inertial 
frame, the laboratory frame, relative to which they are at rest or in uniform 
translatory motion (the times involved in these experiments are supposed to
be sufficiently short, so that the diurnal and annual motions of the earth 
are negligible). The confirmation of the relativistic Doppler effect, of the 
dragging coefficient, and of clock retardation (by means of cosmic rays e.g.) 
are examples of this kind of experiment. Here the interpretation of the 
Lorentz transformations is precisely the one we have adopted above. The 
common characteristic of experiments belonging to group B is that similar 
events e, e’, e”, ... are observed from various inertial frames F, F’, F”, ...:
e from F, e’ from F’, e” from F”, .... Under this heading come the
Michelson-Morley and the Kennedy-Thorndike experiments, also
confirmation of aberration effects. In Einstein’s theory, however, one and the
same event particle e is assumed to be observed from various inertial
frames F, F’, F”, ..., a situation which is only indirectly linked to the
situations considered in groups A and B. Frames of reference are, of
course, themselves physical systems. It follows from what has been said
above that if space and time intervals are measured in accordance with
Einstein’s symmetry convention, then the phenomena (apart from those
that are directly concerned with the substratum) appear to follow the same
relativistic laws in all inertial frames. The Lorentz transformations form a
group; hence if the transitions F → F’ and F → F” are given by Lorentz
transformations, then the same must hold for F’ → F”. To take the simplest
case, let S be a system which is stationary in F, S’ a system which is similar
to S and stationary in another inertial frame F’ similar to F. According to
the principle of independence S behaves in relation to F exactly as S’ does
in relation to F’. Thus, the Michelson-Morley experiment and similar null
experiments can be immediately accounted for.
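The group property appealed to here can be made concrete for transformations along the X-axis. The sketch below is an illustration under stated assumptions: the names `boost` and `relative_speed` are introduced here, units with c = 1 are assumed, and only the x and t co-ordinates are considered. It checks that if two transitions with speeds v1 and v2 are both of Lorentz form, then the transition leading from the first result to the second is again of Lorentz form, with a speed given by the addition theorem.

```python
import math

def boost(x, t, v, c=1.0):
    """Lorentz transformation along the X-axis for the speed v."""
    s = math.sqrt(1.0 - (v / c) ** 2)
    return (x + v * t) / s, (t + v * x / c ** 2) / s

def relative_speed(v1, v2, c=1.0):
    """Speed w such that boost(w) applied after boost(v1) equals boost(v2)."""
    return (v2 - v1) / (1.0 - v1 * v2 / c ** 2)

x, t = 1.5, 2.0
v1, v2 = 0.3, 0.7

direct = boost(x, t, v2)                               # transition with speed v2
via = boost(*boost(x, t, v1), relative_speed(v1, v2))  # speed v1 first, then w

print(direct)
print(via)
```

The two printed pairs should agree up to rounding, which is the content of the group property for this simple case.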


4 CONCLUSIONS AND CONJECTURES 


Lorentz’ ether theory of relativity had serious disadvantages when com- 
pared with Einstein’s special theory of relativity. It involved ad hoc 
hypotheses and lacked the lucid, general, axiomatic form of Einstein’s 
theory. The preceding considerations show that the principle of inde- 
pendence enables us to remedy these defects to some extent. We assume 
that there exists a fundamental frame of reference F in which the basic 
substratum (‘ether’) is at rest locally. By using the light principle and the 
principle of independence we show that the Lorentz transformations are 
valid in F and that consequently the well-known relativistic effects can be 
accounted for. 

One objection remains, however. As all attempts at identifying the basic 
substratum and the fundamental frame of reference F have so far failed, 
motion relative to the substratum remains an operationally empty concept. 
A substratum theory of relativity can thus be criticised for introducing 
differences (between frames of reference e.g.) which observation and 
experiment have failed to reveal. On the other hand, however, two equally 
forceful objections can be levelled at the special theory of relativity. First, 
observation and experiment have not shown that the speed of light is 
c = const. in all inertial frames. Although it seems to be possible, in
principle, to test the invariance of the light principle without presupposing 
the symmetry convention and Einstein’s definition of simultaneity, such a 
test has so far not been carried out. All that can be said at present is that 
the invariance of the light principle is a hypothesis which is compatible 
with the experimental data we possess and which has, in a rather indirect 
way, led to predictions which have been confirmed. The second objection, 
already explained at length, is that Einstein’s theory gives rise to a causal 
anomaly: not only is it unable to give a causal explanation, but it also blocks 
the way to a substratum theory of relativity and hence to what appears to 
be the most attractive causal approach to relativistic phenomena.

Such a substratum theory of relativistic effects, while suggesting that
the principle of relativity might not be as indispensable as many believe,
appears on the other hand to be more in tune with some philosophic
intentions behind Einstein’s general theory of relativity than is the special
theory of relativity. ‘In accordance with classical mechanics and according
to the special theory of relativity’, Einstein writes, ‘space (space-time) has 
an existence independent of matter or field’; ‘if we imagine matter and 
field to be removed, inertial-space or, more accurately, this space together 
with the associated time remains behind’.1 Einstein made it clear that in the
special theory of relativity space-time is absolute in the sense that it has a
physical effect but is not itself influenced by physical conditions.2 In the
general theory, however, Einstein was guided by the idea of relational space 
and time: ‘On the basis of the general theory of relativity ... space as 
opposed to “what fills space”, which is dependent on the co-ordinates, has 
no separate existence. ... There is no such thing as an empty space, i.e. a
space without field. Space-time does not claim existence on its own, but only
as a structural quality of the field.’³ And Einstein declares: ‘I wished to
show that space-time is not necessarily something to which one can ascribe 
a separate existence, independently of the actual objects of physical reality. 
Physical objects are not in space, but these objects are spatially extended. 
In this way the concept “empty space” loses its meaning.’⁴ Whether
Einstein succeeded, in the general theory of relativity, in eliminating 
absolute space is, of course, another question. In his essay ‘Relativity and 
the Ether’ he attempted to eliminate absolute space-time by postulating 
some kind of substratum. But because he wished to retain the principle 
of relativity he could not admit that states of motion can be ascribed to 
the substratum, or components of it. Consequently, this ‘ether’ is unable 
to provide us with a sufficient basis for a causal account of the speed- 
dependent effects described by the special theory. A substratum, however, 
which (or parts of which) can be said to have states of motion in relation 
to material things, would make it possible to combine the idea of relational 
space-time with a causal approach to relativistic phenomena. 

Could a substratum theory of relativity make predictions over and above 
Einstein’s theory? This important question I must leave to professional 
scientists; I am unable to answer it except in a rather vague and general 
sense. Apart from identifying the substratum itself and its interactions 
with elementary particles, atoms and molecules, two additional possibilities 
might suggest themselves. The principle of independence indicates that 
gravitation and the effects special relativity deals with are more closely 


1 Einstein [1968], pp. 150, 154-5.    2 Einstein [1967], p. 54.
3 Einstein [1968], p. 155.    4 Ibid., p. iv (note to the 15th ed.).

linked than is generally assumed. For it is precisely a feature of the
gravitational field that it affects all bodies in the same way (as long as tidal 
effects are negligible). It is true, its influence manifests itself as acceleration
whereas the Lorentz transformations (as interpreted above) relate different 
stationary states of similar systems; but it is by means of accelerations that 
the transitions between the stationary states are brought about. The 
similarity between the two kinds of effects becomes more apparent when 
quantitative expressions are compared. Thus

    $T = T_0(1 + E/mc^2)$

yields approximately both the well-known relativistic retardation (if E is
the kinetic energy of a clock with rest mass m) and the gravitational
retardation (if E is the potential energy of the clock in a gravitational field). 
The second possibility is largely concerned with cosmology: we may assume 
that the speed of light in a vacuum (or better: in what is commonly called 
a vacuum) depends upon the state of the basic substratum which in turn 
depends upon matter present in it. Let us imagine (to use Einstein’s 
famous example) a number of freely falling chests with observers inside;
then observers in different chests may obtain different values for the speed 
of light, depending upon their location and the time of measurement. If, 
say, in earlier stages of cosmic evolution the state of the basic substratum 
was such that the speed of light was considerably larger than it is here and 
now, one would expect both considerable red-shifts and (because of 
$E = mc^2$) highly energetic nuclear processes in very distant cosmological
objects. Whether these conjectures can in some form be substantiated 
remains to be seen. In any case, the time seems to have come to devote 
more attention to the idea of a basic substratum, an idea which has now 
been thoroughly neglected for over half a century. 
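
The expression for T given above can be checked to first order in each of its two readings. The following is only a sketch under stated assumptions: the symbols $v$ (the speed of the clock), $c$ (the speed of light), $T_0$ (the period of the undisturbed clock) and $\Delta\Phi$ (the gravitational potential at the observer minus that at the clock) are labels introduced here for illustration, and the sign convention for E in the gravitational case is likewise an assumption.

```latex
\begin{align*}
\text{kinetic case, } E = \tfrac{1}{2}mv^2:\quad
T &= \frac{T_0}{\sqrt{1 - v^2/c^2}}
   \approx T_0\left(1 + \frac{v^2}{2c^2}\right)
   = T_0\left(1 + \frac{E}{mc^2}\right),\\[4pt]
\text{gravitational case, } E = m\,\Delta\Phi:\quad
T &\approx T_0\left(1 + \frac{\Delta\Phi}{c^2}\right)
   = T_0\left(1 + \frac{E}{mc^2}\right).
\end{align*}
```

Both lines hold only to first order in the small quantities $v^2/c^2$ and $\Delta\Phi/c^2$, which is presumably what ‘approximately’ is meant to cover.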


The City University, London


How to Establish Non-Conventional
Isochrony 


by FERREL CHRISTENSEN 


Like verificationism, conventionalism begins—or generally does so—with 
this premise: If it is in principle impossible to learn empirically whether 
a given assertion is true, then there is no genuine fact to be asserted. 
Unlike the verificationist, however, the conventionalist does not conclude, 
concerning the sort of sentence that concerns him, that it is mere meaningless
nonsense, and hence neither true nor false. It may be meaningful, he
says, but since there are no facts about the world to make it true or false, 
its truth-value can be stipulated freely. It is made true or false simply by 
convention. Such is often claimed to be the case in general with geo- 
chronometrical statements—for example, a statement to the effect that 
some pair of temporally separated time intervals are congruent. After all, 
we can not move one of them through time and bring it into coincidence 
with the other for comparison, so how can we know whether or not they 
are of equal duration? And if we can not know it, the conclusion runs, then 
no statement asserting it or denying it is either true or false—unless, once 
again, we make it so by sheer arbitrary stipulation. (See Reichenbach 
[1958], pp. 113-19.) 

From this it follows that no clock—i.e. a physical system undergoing
cyclic change—is ‘factually’ (non-conventionally) isochronous. We can 
discover empirically whether two periodic processes are ‘equivalent’; that 
is, whether every m cycles of one coincide in time with exactly n cycles of 
the other. But we have no way of knowing whether either of them is (to use 
Carnap’s phrase) strongly periodic, having cycles of equal length. Says 
Carnap, ‘We cannot know that a process is periodic in the strong sense 
unless we already have a method for determining equal intervals of time.
It is precisely such a method that we are trying to establish by our rules.
How can we escape this vicious circle?’ And he concludes that brute fiat
is the only way out. (Carnap [1966], pp. 80-85.) 

Now, it is widely accepted these days that the Verifiability Principle 
must, if it is to be saved at all, be understood in the weak sense that is 
satisfied by the possibility of inductive (or ampliative) evidence. And I 
should think that this goes for conventionalist views as well. But believers
in metrical conventionalism have held that in fact not even inductive
reasons can be given in support of a claim of non-conventional isochrony.
(Actually, at least one conventionalist, Adolf Grünbaum ([1963], part I),
claims not to base his belief on any sort of verifiability considerations. But 
even he, of course, denies the possibility of genuine evidence for any such 
metric in reality.) For instance, it is denied that the relative simplicity or
complexity of whatever physical laws may result from a given choice of 
clock can be used to argue that the clock is or is not isochronous. After all, 
how do we know that all the periodic processes of nature that are normally
taken to be strongly periodic (e.g. atomic vibrations) are not all speeding 
up or slowing down, or even fluctuating wildly, in unison? For if they are, 
then the simplest laws are those based on this large class of
non-isochronous clocks.

However this may be, it has occurred to me that there is a procedure 
which can in principle be used to determine inductively whether or not a 
process is (non-conventionally) ‘strongly periodic’, a procedure that makes 
no appeal to any particular scientific laws. The purpose of this brief paper 
is to describe that procedure and examine its consequences. For the record,
let me say parenthetically that I am opposed to both sorts of attempts to 
link truth-value with verifiability, and moreover that I object to metrical 
conventionalism on the grounds that the meaning or intension of an 
expression is alone sufficient to determine its truth-value or extension. 
I can not pursue these wider issues here. But if I have found a method that 
makes it possible to get inductive evidence for or against statements that 
assert (an-)isochrony, so much the better, for I will have met these 
philosophers on their own grounds. 

Consider the diagrams below. 


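[Fig. 1: two evenly spaced A-processes and two accelerating B-processes, each pair in step.]
[Fig. 2: the two A-processes, one shifted in time relative to the other.]
[Fig. 3: the two B-processes, one shifted in time relative to the other.]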




Let each line of letters represent some cyclic process, the letters them- 
selves standing for an identifiable repeated phase in each. Let a graph in 
which the letters are equally spaced represent a strongly periodic process. 
(It will be simplest, at this point, to talk as if isochrony is factual, in order 
to determine whether any empirical consequences follow.) Then Fig. 1
illustrates two pairs of what Carnap calls ‘equivalent’ processes, where in 
one pair both are strongly periodic, and in the other pair both are 
accelerating at the same rate. But once again, how could we know all this? 
We can tell that the B-processes are accelerating relative to the A-processes 
—or, to say precisely the same thing in a different way, that the A- 
processes are decelerating relative to the B-processes. And we can tell that 
the two A-processes are uniform (no acceleration or deceleration) relative 
to one another; similarly for the B-processes. But how can we know whether 
any one process is uniform or accelerating simpliciter? To answer this, 
notice that if the points on two graphs are evenly spaced, then shifting one 
of them relative to the other will leave them still ‘equivalent’, with the 
same ratio (m/n) of cycles of one to cycles of the other as before. (See
Fig. 2. Here m/n has been illustrated as 1:1 for simplicity.) But if the
spacing is not even, the shift will in general destroy their equivalence 
(as in Fig. 3), or at least result in a different ratio of equivalence. (Except 
in special cases, that is, such as that of sinusoidally varying patterns moved 
by exactly 360 degrees. But these will be discovered by repeated shifts at 
random intervals.) All this suggests the following way of finding out 
whether a process is in fact slowing down, remaining constant, or whatever: 
if we could somehow shift one of a pair of equivalent processes in time, 
they would in general remain equivalent in the same ratio only if both were 
strongly periodic! 
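
The shift test just described can be imitated numerically. What follows is a minimal sketch and no part of the argument itself: the function names, the geometric shrinking of periods used to model an ‘accelerating’ process, and the criterion adopted for 1:1 equivalence (each cycle of one process contains exactly one tick of the other) are all assumptions made for illustration.

```python
def uniform_ticks(period, count, delay=0.0):
    """Tick times of a process with equal periods, optionally delayed."""
    return [delay + k * period for k in range(count)]


def accelerating_ticks(first_period, shrink, count, delay=0.0):
    """Tick times of a process whose periods shrink by a constant factor."""
    ticks, t, period = [], delay, first_period
    for _ in range(count):
        ticks.append(t)
        t += period
        period *= shrink
    return ticks


def stays_one_to_one(a, b):
    """True if each cycle of a within the common time span holds exactly one tick of b."""
    lo, hi = max(a[0], b[0]), min(a[-1], b[-1])
    for start, end in zip(a, a[1:]):
        if start < lo or end > hi:
            continue  # skip cycles outside the span covered by both records
        if sum(1 for t in b if start < t <= end) != 1:
            return False
    return True


a1 = uniform_ticks(1.0, 20)
a2_shifted = uniform_ticks(1.0, 20, delay=0.37)
b1 = accelerating_ticks(1.0, 0.9, 20)
b2 = accelerating_ticks(1.0, 0.9, 20)
b2_shifted = accelerating_ticks(1.0, 0.9, 20, delay=0.5)

print(stays_one_to_one(a1, a2_shifted))  # evenly spaced pair, one shifted
print(stays_one_to_one(b1, b2))          # accelerating pair, unshifted
print(stays_one_to_one(b1, b2_shifted))  # accelerating pair, one shifted
```

On these assumptions the first two checks succeed and the third fails: the evenly spaced pair survives the shift, while the accelerating pair is in step only until one of its members is delayed.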

Of course, we can not literally move a particular individual process from 
one time to another, but perhaps under the right circumstances we can do 
the following: shift the pattern of the process to a different time than that 
at which it would otherwise have appeared, by simply stopping the process 
and subsequently re-starting it. For a stopping and starting procedure may 
have the effect of merely interrupting, without otherwise altering, the 
course of change in the physical system—i.e. not affecting the rate of
speeding up or slowing down, or whatever, except to delay it to the later 
time. (Examples: pull and replace the plug on an electric clock, catch a 
pendulum at its highest point and later release it. Incidentally, we might 
also arrange that in each pair of physical systems, one is structurally 
identical to the other—identical wound springs or pendulums, for example. 
This is not essential to the procedure, but would strengthen the force of
the conclusions.)

To be sure, we have no way of knowing a priori that the stopping and
starting of a process will in a given case have no other effect than that of 
delaying the pattern of change. For example, a pendulum stopped else- 
where than at the apex of its swing, when judged according to our present 
‘standard’ temporal metric, will not merely be delayed—its periods will 
be slightly shortened as well. But I suggest that the outcome of such a test
will itself often provide evidence that there was, in the case in question, 
no other effect. That is, if a certain process, on being re-started, continues 
on doing just the same sort of thing it had been doing when it was halted,
then certain empirical results will ensue; so the actual observation of those 
results will confirm the hypothesis that that is indeed all that has happened. 
In fact, even in cases where the pattern is altered and not just delayed, 
there should in general be ways to isolate out the effect of the stopping 
procedure—say, by keeping that procedure constant while varying the 
length of the delay. But for simplicity I will ignore this latter possibility. 

To illustrate all this now, note that Fig. 2 represents experimental 
results confirming the hypothesis that the A-clocks are isochronous—and 
thereby disconfirming the hypothesis that the B-clocks are isochronous.
That is, we find that no matter how often we stop and re-start either 
A-clock, they remain equivalent in the same (1:1) ratio. Figure 3 represents
results that further disconfirm the hypothesis that the B-clocks were
isochronous: here we find that when one (but not both) is stopped and 
started, they do cease to be equivalent. And still further evidence for the 
same conclusions will result if we observe the following: whenever the 
period-ratio between an A-process and a B-process is m/n and increasing 
when the latter is stopped (or both are stopped), it is still m/n and in- 
creasing at the same rate after re-starting; but whenever the A-clock alone 
is stopped, the period-ratio on re-starting is greater than m/n and changing 
at a different rate. All this is predicted, once again, because shifting an 
accelerating pattern results in there being a different rate of change at each 
time, while a uniform pattern is already the same at all times and hence is 
not affected by a shift. So I hope it is clear that the hypothesis that the 
A-clocks are isochronous and the B-clocks are speeding up has very 
different empirical consequences than alternative hypotheses. If we should 
repeatedly get agreement among these different tests, surely, we would 
have good inductive grounds for holding that hypothesis. 

At this point it may still be objected that such experimental outcomes 
might be partially the result of the stopping and starting, and hence not 
indicative of (non-conventional) uniform or accelerated change after all. 
It seems to me that this possibility can be ruled out as highly implausible, 
in appropriate cases, by the quantitative consequences of this sort of test.
To illustrate this point with an example that is especially simple, suppose
we discover that the B-clocks are speeding up exponentially relative to the
A-clocks, and (having already concluded that the latter are isochronous)
we infer that the B-processes are accelerating exponentially simpliciter.
More precisely, the duration of a period for the two B-clocks is inferred
to be given by the general formulas $k_1 e^{-a(t-t_1)}$ and $k_2 e^{-a(t-t_2)}$. Then by
simple computation, the period-ratio for two such processes is given by
$m/n = (k_1/k_2)e^{a(t_1-t_2)}$. But this is a constant; hence, as required, the two
clocks will be equivalent at this ratio as long as neither is tampered with.
But should one of them be stopped for a period of time $\Delta t$ (measurable on
one of the A-clocks), we find with another simple calculation that m/n
should have a new value, different from the old one by a factor of $e^{a\Delta t}$.
(The ratio will be constant throughout any period in which neither clock 
is disturbed, but it will have a different value after each interruption. This 
is a convenient result that obtains only for certain special patterns of 
process acceleration.) Thus, we can predict in advance the amount by 
which the empirical quantity m/n will change. And we can make still 
other testable predictions, such as that subsequently stopping the other
B-clock for the same period of time will result in their returning to the 
original ratio of equivalence. 
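
The two ‘simple calculations’ referred to in this paragraph can be written out as follows. This is a sketch under assumed notation: $a$ stands for the common rate of exponential acceleration, $k_1, k_2$ for the scale factors and $t_1, t_2$ for the time offsets of the two B-clocks, and stopping a clock for $\Delta t$ is taken to delay its whole pattern by $\Delta t$ and to do nothing else.

```latex
\begin{align*}
T_1(t) &= k_1 e^{-a(t - t_1)}, \qquad T_2(t) = k_2 e^{-a(t - t_2)},\\[2pt]
\frac{T_1(t)}{T_2(t)} &= \frac{k_1}{k_2}\, e^{a(t_1 - t_2)}
  \quad\text{(independent of } t\text{)},\\[2pt]
\text{clock 1 stopped for } \Delta t:\quad
T_1'(t) &= T_1(t - \Delta t) = e^{a\Delta t}\, T_1(t),
  \qquad
\frac{T_1'(t)}{T_2(t)} = e^{a\Delta t}\,\frac{k_1}{k_2}\, e^{a(t_1 - t_2)},\\[2pt]
\text{clock 2 then stopped for } \Delta t:\quad
\frac{T_1'(t)}{T_2'(t)} &= \frac{e^{a\Delta t}\, T_1(t)}{e^{a\Delta t}\, T_2(t)}
  = \frac{k_1}{k_2}\, e^{a(t_1 - t_2)}.
\end{align*}
```

The last line is the prediction that stopping the other B-clock for the same period restores the original ratio of equivalence.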

Parenthetically, we can verify the hypothesis that the B-clocks are 
accelerating exponentially without measuring the period of interruption 
on an A-clock. The change in m/n, computed above as a function of $\Delta t$—
where $\Delta t$ was of course assumed in computation to involve congruent time
units—can also be calculated as a function of the periods of an ‘exponential’
clock, though the mathematics is more complex. (One result is that m/n
becomes a function of the time at which each stopping takes place, as well 
as of the duration of the delay.) Consequently, even in a world without any 
naturally-occurring strongly periodic processes, we might still be able to 
infer how their rate would change relative to the natural ‘clocks’, and 
perhaps even construct them. 

Now to draw out the moral of all this. If, for given physical systems 
undergoing cyclic change, all such predictions should be borne out, again 
and again, I should think we would have good reasons for deciding that cer- 
tain ones of them are genuinely isochronous and that others are factually 
anisochronous. It is difficult to see how any alternative hypothesis could
account equally well for the sort of empirical observations here envisioned. 
Of course, there is no a priori guarantee of living in a world where there
are processes which will yield any inductively exploitable patterns at all 
when repeatedly stopped and started—much less the highly suggestive 
patterns that I have described here. But surely, as long as it is in principle


The Eliminability of Masses and Forces
in Newtonian Particle Mechanics: Suppes 
Reconsidered * 


by JON DORLING 


In his well known axiomatisation of classical particle mechanics (McKinsey, 
Sugar and Suppes [1953]; Suppes [1957]), Suppes was able to prove that 
masses and forces are independent primitive notions which cannot be 
defined on the basis of the other primitives. This result is normally taken 
to show that, contrary to Mach’s positivist views, masses and forces are 
theoretical terms which cannot be eliminated in favour of observational 
terms. Taken at face value this conclusion seems also to provide strong 
support for hypothetico-deductivism as against inductivism. 

However it is difficult to accept Suppes’s easily proved results at their 
philosophical face value. On the one hand (contrary to what most philo- 
sophers seem to suppose) mathematical physicists seem, in general, to have 
succeeded in eliminating theoretical terms in favour of more directly 
observational terms, in most other theories, whenever they have seriously 
set out to do so: for example everyone knows how to eliminate vector 
potentials from classical electrodynamics without loss; it is trickier in the 
case of quantum electrodynamics, but Mandelstam showed how to do it 
should we so choose; Wheeler and Feynman, following Tetrode and others, 
deliberately set out to eliminate even the fields from classical electro- 
dynamics and in this they succeeded; the distinguished present-day 
positivist Robin Giles has published elegant formulations of classical and 
relativistic thermodynamics (Giles [1964]) and of quantum mechanics 
(Giles [1970]) in precisely defined observational languages in which the 
usual theoretical terms are later reintroduced by explicit definition. It 
would be surprising if classical mechanics proved to be an exception. On
the other hand Suppes’s choice of observational primitives is sufficiently 
idiosyncratic to invite suspicion. 

G. W. Mackey, in his [1963], presented in the first few pages an 
axiomatisation of classical particle mechanics which is very different from 
that of Suppes. Suppes took the position of each particle as given as a 
function of the time only. From this we can calculate the acceleration of 
each particle as a function of the time only. Mackey, on the other hand,
assumes that we are given the acceleration of each particle as a function of
its position and of the positions of the other particles. Mackey is then able 
to formulate Newton’s third law as a condition on the partial derivatives of 
these acceleration functions, and is then able to define mass ratios as the 
inverse ratios of the appropriate partial derivatives. His definition is 
formally a generalisation of Mach’s definition of mass ratios. However
Mach’s definition, to be applicable, required highly idealised, and 
typically unrealisable, experimental conditions, whereas Mackey’s enables 
us immediately to compute the mass ratios from the given acceleration 
functions. In his next paragraph (we are still only on p. 4 of Mackey’s 
book) he is able explicitly to define the (total) force acting on each particle. 
It is true that without further conditions on the acceleration functions 
Mackey cannot go on to define the component force due to each source
particle, but with further conditions on the acceleration functions
(conditions corresponding to the requirement of central forces) there seems no
difficulty in doing this as well. Suppes’s results are thus simply not 
applicable to Mackey’s type of axiomatisation. They are peculiar to 
Suppes’s type of axiomatisation. 
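
The idea of reading mass ratios off the acceleration functions can be illustrated with a toy case. The sketch below is hedged throughout: it assumes a condition of the form $m_i\,\partial a_i/\partial x_j = m_j\,\partial a_j/\partial x_i$ (which holds whenever the forces derive from a potential) as one reading of the ‘condition on the partial derivatives’, and makes no claim to reproduce Mackey’s own formulation. The two-particle spring system, the function names and the numerical values are all hypothetical.

```python
def accelerations(x1, x2, m1=2.0, m2=5.0, k=3.0):
    """Accelerations of two particles on a line joined by a spring.

    The masses are hidden inside this function; the caller sees only
    accelerations as functions of the two positions.
    """
    force_on_1 = -k * (x1 - x2)
    return force_on_1 / m1, -force_on_1 / m2


def partial(f, index, point, h=1e-6):
    """Central-difference partial derivative of f with respect to point[index]."""
    up, down = list(point), list(point)
    up[index] += h
    down[index] -= h
    return (f(*up) - f(*down)) / (2 * h)


def mass_ratio(point):
    """Estimate m1/m2 as the inverse ratio of the cross partial derivatives."""
    da1_dx2 = partial(lambda x1, x2: accelerations(x1, x2)[0], 1, point)
    da2_dx1 = partial(lambda x1, x2: accelerations(x1, x2)[1], 0, point)
    return da2_dx1 / da1_dx2


ratio = mass_ratio((0.3, 1.1))
print(ratio)
```

The computed ratio agrees with the hidden value m1/m2 = 2/5 to within rounding error, although only the acceleration functions were consulted.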

Now, in Suppes’s defence, one might attempt to reply that direct 
observation only gives us the positions and accelerations of particles as 
functions of the time and not as functions of the positions of other particles. 
One might argue for this contention on the grounds that the latter alter- 
native (Mackey’s) licenses, inter alia, counter-factual predictions, whereas 
the Suppes alternative does not. This would seem at first to suggest that 
Suppes’s primitive function is more directly observational. This contention 
seems to me to rest on a mistake. All we actually observe is the value of the 
function (the position or acceleration of a given particle) for certain values 
of its argument or arguments (the time, or the positions of the other 
particles). The step from this finite set of data to the values of the function 
for all possible values of its argument, or arguments, involves extrapolation 
and interpolation which goes beyond the data and which involves guessing
the values of the function for intermediate values of the arguments and for
values outside the range of values of the arguments included in the
observations. Now I claim that any reasonable methodological principles— 
inductive, hypothetico-deductive, or what have you—which will license 
this extrapolation and interpolation in the one case will license it in the 
other. For unless it happens that the time is the only independent variable, 
and hence that we can be sure that all possible values of the independent 
variables will be realised at some time or other, extrapolation and inter- 
polation will characteristically yield some predictions which are empty in 
the sense that the conditions for testing them have never arisen and will
never arise; in other words, it will yield some counter-factual predictions.
But nobody—be he inductivist, hypothetico-deductivist, or what have 
you—would seriously propose as a reasonable methodological principle for 
interpolating and extrapolating from experimental points on a graph, a 
principle which was restricted by fiat to the case where the abscissa
represented the time and which explicitly disallowed similar interpolation 
and extrapolation in virtually any other case. I conclude that from the 
point of view of any reasonable positivist epistemology there can be no 
objection to Mackey’s choice of observational primitives and no reason for 
preferring Suppes’s much more idiosyncratic choice. 

The claim that the usual theoretical primitives of classical particle 
mechanics cannot be eliminated in favour of observational primitives seems 
therefore not only not to have been established by Suppes’s results, but to 
be definitely controverted in the case of more orthodox axiomatisations 
such as Mackey’s. Suppes’s results, therefore, can be no weapon against 
the claims of positivists and inductivists. 


University of London, 
Chelsea College


Discussion


PUTNAM AND WILKS AND MIND AND BODY*


INTRODUCTION 


Putnam’s ingenious discussions of the mind-body problem have recently come 
under attack in this Journal from Yorick Wilks.¹ However, as I shall show,
Wilks’s criticisms of Putnam are invalid.²


PUTNAM’S ARGUMENT 


Putnam claims that a problem analogous to the mind-body problem arises for 
some types of Turing Machine and that none of the current arguments in favour 
of dualism is able to establish any fundamental difference with regard to the 
existence of a mind between humans and these machines. Putnam’s arguments 
are, briefly, as follows.³ A Turing Machine is completely described by a machine
table which constitutes its program. The machine may be in various ‘states’, 
labelled A, B,.... It has a tape divided up into separate boxes in each of which 
a symbol may be printed which the machine is capable of ‘scanning’. The items 
appearing in the machine table are all instructions of the form: if the TM is in 
state A and the symbol it is scanning is $s_1$, then replace the scanned symbol by
$s_2$, or move on to scan the adjacent left-hand (or right-hand) box and shift into
state C. 
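
A machine table of the kind just described can be sketched in a few lines. This is an illustration only: it reads the ‘or’ in the instruction form as meaning that each instruction either replaces the scanned symbol or moves the scanner, and the particular table, the state names A, B and H, the blank symbol and the function name `run` are all hypothetical.

```python
from collections import defaultdict

BLANK = "_"  # assumed symbol for an empty box


def run(table, tape, state, halting, max_steps=1000):
    """Run a machine table mapping (state, symbol) to (action, next_state).

    An action is ("write", symbol), ("left",) or ("right",), so each
    instruction either replaces the scanned symbol or moves the scanner.
    """
    boxes = defaultdict(lambda: BLANK, enumerate(tape))
    head = 0
    for _ in range(max_steps):
        if state in halting:
            break
        action, state = table[(state, boxes[head])]
        if action[0] == "write":
            boxes[head] = action[1]
        elif action[0] == "left":
            head -= 1
        else:
            head += 1
    used = range(min(boxes), max(boxes) + 1)
    return state, "".join(boxes[i] for i in used)


# Hypothetical table: replace every 1 by 0, halting at the first blank box.
TABLE = {
    ("A", "1"): (("write", "0"), "B"),
    ("A", "0"): (("right",), "A"),
    ("B", "0"): (("right",), "A"),
    ("A", BLANK): (("write", BLANK), "H"),
}

final_state, final_tape = run(TABLE, "1101", "A", {"H"})
print(final_state, final_tape)
```

Note that the variable `state` is simply looked up at each step; nothing in the program computes which state the machine is in, which is the feature of such machines on which the argument below turns.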

The machine proposed as an analogue for humans has a mechanism for 
printing out ‘I am in state A’ (or some equivalent) whenever it is in state A. 
It is also provided with sensors and a program which enables it to make fallible 
conjectures about its own structure of the sort: 


(i) ‘I am in state A iff flip-flop 36 is on’.

Putnam compares the status of this machine utterance with the human statement:

(ii) ‘I am in pain iff my C-fibres are stimulated’.

Some philosophers have argued that ‘I am in pain’ and ‘my C-fibres are
stimulated’ have different meanings, since they are differently verified and that
hence (ii) is non-analytic. They also claim that there is a crucial difference in
status between these two statements, a difference which derives from the fact 
that being in pain has a directly observable and incorrigible character while 
statements about C-fibres are fallible. On the basis of these claims, these 
philosophers have asserted that human beings possess a non-physical mind. 


* I should like to thank Peter Clark, Gregory Currie and Colin Howson for their critical 
reading of an earlier draft of this paper. Any errors are, of course, my own. 

1 Wilks [1975]. 

2 In the course of his paper, Wilks also rebuts earlier criticisms of Putnam by Clarke. This
rebuttal seems to me entirely successful.

3 For a more detailed exposition see Putnam [1961] and [1964].

Putnam’s main claim is that analogous arguments can be constructed in
connection with the machine utterance (i) and that they can be used in an 
analogous way to support a dualism for TMs. (His concern is, of course, to show 
the poverty of existing arguments for a mind-body duality in humans rather than 
to establish that TMs have minds.) 

First, (i) is non-analytic, since ‘being in state A’ and ‘flip-flop 36 being on’ 
have different meanings and whether some TM has these properties is verified in 
different ways. Secondly there is an important difference in status for the 
machine between the statements ‘I am in state A’ and ‘flip-flop 36 is on’
since, although one can describe how the TM knows its flip-flop 36 is on by 
describing the sequence of states it passes through in making the relevant 
computation and the appearance of the tape at each stage in the computation, 
there is no sequence of states which a TM goes through, no computation which 
it makes, in order to be in some state. Just as a human knows he is in pain by 
being in pain, so the Turing Machine knows it is in state A by being in state A. 


WILKS’S CRITICISM 


Wilks claims that Putnam’s argument contains ‘a straight-forward
sleight-of-hand trick with the notion of “a TM is in state A”’.¹ The notion of a ‘state’ is
introduced in connection with abstract TMs but is retained as an abstract notion 
even when discussing real TMs. This Wilks regards as illegitimate, since ‘the 
very process of setting up a real, as opposed to an abstract, TM must have made 
the notion “state” concrete too’.² For Wilks, ‘being in state A’ and ‘having some
particular physical structure’ are synonymous for a real TM—that is, some 
physical state of the machine defines what it is for the machine to be in state A
and being in that state is then nothing other than being in the corresponding 
physical state. When the machine asserts ‘I am in state A’, it is asserting 
nothing more than ‘I am in such-and-such physical state’. 
Wilks then argues that one can describe how the concrete TM ascertains that
it is in state A simply by specifying certain physical processes preceding the
printing of ‘I am in state A’. Moreover, the way in which the TM ascertains 
that its 36th flip-flop is on can be similarly described in purely concrete, physical 
terms: in both explanations, talk of ‘states’ is completely otiose. Since there is 
no distinction between logical and physical states, the distinction, vital for
Putnam’s thesis, between the two questions ‘how did the TM know it was in 
state A?’ and ‘how did the TM know that flip-flop 36 is on?’ thus disappears,
as does the analogy with human beings. 

(I must admit that I have had some difficulty in interpreting what Wilks says 
in this connection and hope I have captured his intent, and perhaps strengthened 
his argument. Wilks expresses his point as follows: 


‘... as soon as the TM becomes concrete, a different situation arises, even
if the concreteness is provided only by pieces of cardboard and a reading 
and moving and writing person, who carries out the above instructions by 
hand in a ‘mechanical’ way (the nature of the person, or that it is a person, 
is of no importance here).


1 Wilks, ibid., p. 215.    2 Ibid., p. 216.

Consider such a pencil and card machine in which the current state is
indicated by a pack of cards with upper case letters on them, A, B, etc. so
that the state at any time is indicated by the card face uppermost on top of 
the pack. Notice that the current state must be indicated somehow: it
cannot remain an undefined notion or we cannot proceed with the game. 

Nothing could be simpler than this toy, but real, TM. When the person
reads the machine table item above, ‘if the machine is in state A, and ...’,
he looks at the top of the pack to see if an A is showing, if it is he goes on, 
if not he looks at the next machine table item, and acts in an appropriate way. 

It is a perfectly clear question ‘how does this TM know it is in state A?’ 
and the answer is ‘by looking at the top card to see if A is there’. Whenever
the TM becomes real, some such definition of ‘state A’ must be given, so 
that ‘state A’ then becomes a derivative notion. It may remain in what 
Putnam calls the ‘logicians’ description’ but only by courtesy, as it were, 
for it has no function now that a real TM has been defined. This is the
crux of the matter.’¹


In making his main point the claim that ‘it is a perfectly clear question “how 
does this TM know that it is in state A?”’, Wilks seems to lay great stress on
Putnam’s rhetorical assertion that this question is ‘silly’ and ‘does not make 
sense’. Wilks emphasises these phrases in his opening exposition of Putnam’s 
thesis and in ignoring their rhetorical character Wilks gives a much too super- 
ficial and misleading account of Putnam’s case. The crux of this case is that 
there is a difference between the questions ‘how does the TM know that it is in 
state A? and ‘how does the TM know that flip-flop 36 is on?’ which is 
analogous to the difference between the questions ‘how does Peter know he is in 
pain?’ and ‘how does Peter know his C-fibres are stimulated?’ 

Thus in order to break Putnam’s analogy, Wilks has to argue that the putative
difference between the two questions does not exist; one way of doing this would
be by granting Wilks’s often repeated claim that, for real TM’s, the notion of
state is completely exhausted by some given concrete setup.²


PUTNAM VINDICATED 


Wilks’s claims appear to be based on an extreme nominalism which denies that 
concrete objects have any general properties. This is presumably why he claims 
that 


the very process of setting up a real, as opposed to an abstract, TM must
have made the notion of ‘state’ concrete too.³


Consider the following further quotations from Wilks’s paper which evidence 
the nominalist foundations of his case: 


Whenever a TM becomes real, some such definition [in terms of a physical 
configuration] of ‘state A’ must be given, so that ‘state A’ then becomes a 
derivative notion. It may remain in what Putnam calls the ‘logicians’ 
description’ but only by courtesy, as it were, for it has no function now 
that a real TM has been defined. This is the crux of the matter. Conversely, 


1 Ibid., p. 216.   3 Ibid.

if no such definition, or procedure, has been given, then we still have only
the abstract TM and the real TM cannot run.¹

Putnam’s claim that ‘I am in state A’ and ‘flip-flop 36 is on’ are ‘clearly
non-synonymous in the machine’s language by any test [since] the machine has
to use different methods of verification...’ is simply and straightforwardly
false for any real TM.²

Since the [given] machine is real and not abstract, we can just demand that 
he [Putnam] tell us what it is to be in state A, and once told, his case fails, 
for state A then has no unexhausted content.³


Now Wilks’s nominalism is clearly false. Consider, for example, the general
property: ‘weighing ten grammes’. This property can be realised in a large
number of concrete situations, such as this thimbleful of oil and that bucketful
of feathers. When so realised, it is not the case that ‘being this particular
thimbleful of oil’ and ‘weighing ten grammes’ are synonymous, nor that the
general property has become a concrete, or particular property. That this is so
is clear from the fact that there are logical consequences of the statement ‘X
weighs ten grammes’ which are not consequences of the statement ‘X is this
thimbleful of oil’, for example, the consequence ‘X weighs more than five
grammes’.

We may thus discuss the weight of an object without having to regard ‘weight’
as a façon de parler for some particular collection of atoms. Similar considerations
apply to other general properties such as ‘having so much energy’, ‘being
square’ and ‘being propositional’. Now ‘being in state A’ is a general, complex 
dispositional property whose exact dispositional character is determined by the 
machine table in which it is embedded. For example, state A may be the property 
which determines any machine in that state to overprint s₂ and go into state B
whenever symbol s₁, say, is scanned. ‘State B’ can be similarly decomposed and
the analysis of ‘state A’ is complete when the only state mentioned in its analysis 
is the halting or final state. The dispositional property of being in state A is no 
more to be identified with some configuration of the parts of a real TM than is 
the weight of that machine to be identified with its arrangement of atoms. If one 
were to admit Wilks’s point and identify the logical state of a machine with some 
configuration of its parts, then no machine operating with vacuum tubes, say, 
could be in the same logical state as a machine which employs magnetic drums— 
nor, mutatis mutandis, could the two machines have the same weight! 
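
The point about two differently built machines sharing a logical state can be made concrete with a toy sketch. What follows is an illustration only, not a construction taken from Putnam or Wilks: the machine table `TABLE`, the two realisations `CardPack` and `FlipFlops`, and the function `run` are all hypothetical names introduced here. One realisation indicates the state by the top card of a pack, the other by a pair of flip-flops, and both are driven by the same machine table.

```python
# Machine table (an assumed toy example):
# (logical state, scanned symbol) -> (symbol to print, head move, next state)
TABLE = {
    ("A", "1"): ("0", +1, "B"),
    ("A", "0"): ("0", +1, "HALT"),
    ("B", "1"): ("1", +1, "A"),
    ("B", "0"): ("1", +1, "HALT"),
}


class CardPack:
    """Realisation 1: the state is shown by the top card of a pack."""

    def __init__(self):
        self.pack = ["A"]

    def read_state(self):
        return self.pack[-1]

    def write_state(self, state):
        self.pack.append(state)


class FlipFlops:
    """Realisation 2: the state is encoded in two flip-flops."""

    ENCODING = {"A": (True, False), "B": (False, True), "HALT": (False, False)}

    def __init__(self):
        self.bits = self.ENCODING["A"]

    def read_state(self):
        for name, bits in self.ENCODING.items():
            if bits == self.bits:
                return name
        raise ValueError("flip-flop pattern encodes no logical state")

    def write_state(self, state):
        self.bits = self.ENCODING[state]


def run(hardware, tape):
    """Run TABLE on the given hardware; return the final tape and the
    sequence of logical states passed through."""
    cells = list(tape)
    head = 0
    trace = []
    while hardware.read_state() != "HALT" and head < len(cells):
        state = hardware.read_state()
        trace.append(state)
        symbol, move, next_state = TABLE[(state, cells[head])]
        cells[head] = symbol
        head += move
        hardware.write_state(next_state)
    return "".join(cells), trace


card_result = run(CardPack(), "110")
flip_result = run(FlipFlops(), "110")
print(card_result == flip_result)
```

The two runs agree on the final tape and on the sequence of logical states, although the physical description of each machine (a list of cards, a pair of booleans) differs at every step. The sketch claims no more than that: the machine table fixes what ‘state A’ does, while the encoding of state A is a separate fact about each realisation.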

I conclude that, contrary to Wilks’s claim, the logical state of a machine has a 
meaning quite distinct from that of a description of the particular physical setup 
which instantiates that state. It follows that the statement ‘I am in state A iff 
flip-flop 36 is on’ is synthetic and that, as Putnam asserts, a TM ascertains the 
condition of its flip-flops by computation (that is, by changing its logical state), 
while no change in logical state, no computation, is required in determining its 
current logical state. Since the distinction between a TM’s logical states, and its 
physical states does—contra Wilks—exist for real machines, there is the difference 
between the questions ‘how does the TM know it is in state A?’ and ‘how does 
the TM know that flip-flop 36 is on?’, which Putnam points out. 


1 Ibid., pp. 216-17; my italics.
2 Ibid., p. 271; my italics.
3 Ibid., p. 217; my italics.

The analogy between a TM’s direct knowledge of its logical state and a person’s
direct knowledge of his mental state and between a TM’s and a human’s indirect
knowledge of their physical states is thus preserved against Wilks’s attack.


PETER URBACH
London School of Economics


Review Articles


WAVE-PARTICLE DUALITY* 


1 The central philosophical puzzle posed by the quantum theory is what
attitude one should adopt to the apparently mysterious dual nature exhibited in
the microphysics of matter and radiation, for example that radiation behaves in
some respects—interference, diffraction, etc.—like a wave process governed by
the classical Maxwell equations of an electromagnetic field, and in other respects
—photoelectric effect, Compton scattering, etc.—like a beam of particles, the
so-called quanta or photons. The bewilderment of physicists in this paradoxical
situation was captured by Sir William Bragg in his famous remark, ‘We use the
classical theory on Mondays, Wednesdays, and Fridays and the quantum theory
on Tuesdays, Thursdays and Saturdays’. A reciprocal duality for material subatomic
particles emerged in the discovery of electron diffraction and other
wave-like phenomena.

There have been three main approaches to trying to come to terms with the 
duality problem. The first approach is simply to deny that quantum mechanics, 
QM, is basically mysterious. The slogan here is that, despite appearances, a
particle is a particle is a particle, and mutatis mutandis neither need we equivocate 
about waves. On this view QM is really a sort of glorified statistical or perhaps 
more correctly stochastic mechanics. Exponents of this general approach would 
include Popper, Landé and Fine, and it is now vigorously defended by Professor 
Audi in his book The Interpretation of Quantum Mechanics. This approach can 
be extended to an attempt to reinstate determinism at the level of some hidden 
infrastructure, but the question of whether a hidden variable theory is intrin- 
sically probabilistic or deterministic is quite a separate issue from whether in 
reality microsystems can be described as associated with the familiar classical 
variables, for example an electron can be regarded as possessing simultaneous 
position and momentum and tracing a classically describable trajectory in passing 
through a two-slit system designed to reveal electron interference phenomena. 
In particular Audi wants to defend an indeterministic particle interpretation of 
the electron. We shall consider his detailed arguments and the general difficulties 
associated with this approach in a moment. 

The other two approaches to duality agree in admitting that QM is mysterious 
in the sense that we cannot hope to understand the paradox in terms of purely 
classical physics, but they bifurcate in the following way. For Bohr the descrip- 
tion of quantal phenomena must be in terms of classical concepts, for example 
of wave and particle, but the new feature of QM is a limitation on the applicability
of these concepts expressed in the principle of complementarity. To
paraphrase the Bragg quotation it is all right to use contradictory concepts on
Mondays and Tuesdays—a logical contradiction would only result if we were
compelled to employ both concepts on a Monday, but happily we are never in

* Review of Audi, M. [1973]: The Interpretation of Quantum Mechanics. Chicago-London:
University of Chicago Press. Pp. xiv+200; and Jauch, J. M. [1973]: Are Quanta Real?
—A Galilean Dialogue. Bloomington-London: Indiana University Press. Pp. xii+106.

the situation of having to do this, as each experimental arrangement dictates the
appropriate concept to employ. The third approach we want to distinguish,
which we may arguably¹ associate with the name of Heisenberg, is that QM is
mysterious in that it requires, not merely a limitation on the use of old concepts,
but quite new concepts. Reality on this view is not a wave or a particle in com- 
plementary situations but rather transcends and unites both these concepts. The 
new concepts may actually turn out to be old ones, older than those of Newtonian 
physics that is, and indeed Heisenberg’s ideas on potentiality and Margenau’s 
latency theory hark back to Aristotelian physics. At the formal level the
Dirac-Jordan-Klein-Wigner second quantisation formalism was thought by Heisenberg
to express the unification of wave and particle views—reality is neither a wave 
nor a particle, but a quantised field. Audi devotes the final chapter of his book to 
an attempt to demolish this general approach to quantum field theory. 

Let me now say a word about Jauch’s book Are Quanta Real?² He has had the
happy idea of displaying the arguments about the true nature of quanta in the
form of a Galilean dialogue. The familiar characters of Salviati, Sagredo and
Simplicio are met again to discuss the opposing views in this modern con- 
troversy. Salviati now expounds the new ideas of Bohr and the principle of 
complementarity, while the lumbering Simplicio defends the classical notion of 
microsystems as possessing real existence independent of observation and subject 
to deterministic laws of time evolution. Sagredo plays his familiar role of 
impartial observer, who allows himself to be persuaded at all stages by the 
‘superior’ arguments of Salviati. So just as in the Copernicus-Ptolemy debate in 
the original dialogue everything is carefully loaded to convince the reader of the 
necessity for the new approach. 


2 Having adumbrated the broad perspectives in terms of which the books by 
Audi and Jauch may be set in context, let us turn to some of the detailed 
arguments. 

In Jauch’s book the point is well stressed that the ultimate basis for the disputes 
about the interpretation of QM is ideological. If one finds a consistent deterministic
particle interpretation, for example, the source of one’s satisfaction is
really no more than an irrational prejudice in favour of explanations in the style 
of those provided by classical physics. The question of whether adherence to 
such a prejudice should outweigh any concomitant difficulties in such an inter- 
pretation is not something that can be decided by rational argument. What can 
be decided is just what these concomitant difficulties are. Both Jauch (in the 
guise of Salviati) and Audi attack deterministic hidden variable theories. Jauch 
says that if we allow any number of hidden variables with sufficiently peculiar 
properties we can give a deterministic explanation for any occurrences, so such 
a weak theory would have no predictive value and hence be ‘no theory at all’.
Audi makes the point that hidden variable theories are more complex in the sense 
of assuming more entities, the hidden variables, and claims that if they give the 
same successful predictions as indeterministic particle theories, these predictions 





1 It would be easy to cite quotations from Heisenberg explicitly contradicting this approach.
But we are not concerned here with the internal inconsistencies in Heisenberg’s writings
on QM.
2 This was the last book published by Professor Jauch before his death in 1974.

best support the simpler theory. On the other hand if they predict new phenomena,
then on present evidence such predictions have not been confirmed. So
either type of hidden variable theory is to be rejected. On Jauch’s point my
comment would be that such a very weak hidden variable theory would certainly
be uninteresting. What does become an interesting question is whether, if we
impose certain restrictions on the hidden variables, we can then give a hidden
variable interpretation. This is the status of the so-called no-hidden-variable
proofs of von Neumann, Kochen and Specker, Bell, etc. Of course one can always
get round these no-hidden-variable proofs by relaxing the restrictions—then our
arguments must turn on whether the conditions we are obliged to relax are ones
we feel uneasy about giving up. An important point here is the relationship
between intrinsic properties such as position and momentum of an electron and
the results of measurement of such properties. One move to circumvent
no-hidden-variable proofs is to say that restrictions that may be plausible for results
of measurement need not be expected to apply to intrinsic properties if these are
disconnected from the results of measurement by uncontrollably indeterministic
measurement interactions. This appears to be Audi’s position, but as Dorling
has stressed in his [1975] such an approach makes the realistic interpretation
wildly metaphysical, if measurement results have no bearing on intrinsic
properties. Elsewhere Audi does want to claim that measurements can retrodict
intrinsic properties, and would apparently support Popper’s [1967] view that the
possibility of retrodiction is essential to testing a statistically interpreted QM.
There seems to be a basic inconsistency in Audi’s views here. On Audi’s argument
that predictions lend less empirical support to a complex theory, this must
depend on the initial plausibility of the theory which for ideological reasons
might be greater for the complex theory. Audi’s understanding of confirmation
theory seems to falter on this point.

Both Jauch and Audi want to maintain that QM is intrinsically indeterministic. 
Jauch argues, following Born for example, that in this respect there is no break 
with classical mechanics. Classical trajectories may be unstable with respect to 
small variations in initial conditions, leading to effective indeterminacy. The 
force of this argument is not to show that QM is after all reducible to classical
mechanics but that determinacy is not an entrenched feature of purely classical 
explanations—hence there should be no undue resistance to admitting a more 
pervasive incidence of indeterminacy in a physical theory. Audi gives a quite 
different argument. He says that in QM it is characteristic that if at some initial 
time we have certain knowledge of some property of a microsystem, i.e. if the 
state of the system is initially an eigenstate of the associated operator, then in 
general our knowledge will ‘degenerate’ as time progresses into purely 
probabilistic knowledge. The point that Audi misses here is that this degenera- 
tion of knowledge in respect of one feature is always compensated in QM by a 
corresponding sharpening of knowledge concerning some other feature—or to 
put the point more precisely we always have certain knowledge of the observable 
associated with the projection operator whose range is the instantaneous state 
vector of the system. In this respect QM displays an unusual reciprocal concentration
and dissipation of probabilities as time proceeds. This is the feature
that actually distinguishes the situation in QM from that obtaining in classical
statistical mechanics where an initial concentration of probability does become
progressively dissipated as time progresses, at any rate for times not comparable
with Poincaré recurrences.
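
The remark about the projection operator can be written out explicitly. The following is a sketch in standard Dirac notation, assuming a normalised state vector evolving unitarily; the symbols U(t) and P_ψ(t) are introduced here for illustration and are not taken from Audi’s text.

```latex
\begin{align*}
  |\psi(t)\rangle &= U(t)\,|\psi(0)\rangle, \qquad
      \langle\psi(t)|\psi(t)\rangle = 1,\\
  P_{\psi}(t) &= |\psi(t)\rangle\langle\psi(t)|
      = U(t)\,P_{\psi}(0)\,U(t)^{\dagger}, \qquad
      P_{\psi}(t)^{2} = P_{\psi}(t),\\
  \langle\psi(t)|\,P_{\psi}(t)\,|\psi(t)\rangle &= 1, \qquad
  \langle P_{\psi}(t)^{2}\rangle - \langle P_{\psi}(t)\rangle^{2} = 1 - 1 = 0 .
\end{align*}
```

At every instant the observable P_ψ(t) thus takes the value 1 with zero dispersion. What changes with time is which observable this is, so that certainty is transferred from one feature to another and not simply lost.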

Audi attacks as mere equivocation the contention that QM is deterministic 
merely on the grounds that the time evolution of the state vector is uniquely
determined by the Schrödinger equation. While maintaining that QM is indeterministic
Audi does not want to conclude that the theory violates a principle
of causality. The distinction he draws here is that while every effect may have a 
cause, in the sense that later physical states arise via transitions from earlier ones 
and are due to disturbances capable of producing them, nevertheless it is not the 
case that identical causes produce identical effects. By a disturbance capable of 
producing a given transition Audi says he means that energy and momentum 
conservation laws are satisfied. It is not clear to me why other conservation laws 
should apparently be excluded from determining the possible range of outcomes 
of a disturbing interaction if the transitions are to be described as causal. The 
privileged emphasis on energy and momentum suggests that Audi’s distinction 
between causality and determinism is based on a muddled view of the status of 
conservation laws in classical mechanics. 


3 Let me now turn to the central argument that Audi employs for claiming the
naturalness and indeed, if I understand him correctly, the necessity of his particle
interpretation. For Audi this has nothing to do with ideological commitments of
the sort that Jauch, in my view correctly, stresses. Audi’s argument is simply this.
Quantum mechanics contains classical mechanics as a limiting case, and indeed 
presupposes the classical description of measurement apparatus for its very 
formulation, but for this limiting correspondence to be possible key terms such 
as ‘particle’, ‘position’ and ‘momentum’, must have the same meaning in QM
and in classical mechanics, otherwise we would have to regard the two theories 
as incommensurable in the style of Feyerabend, in which case one theory could 
never be subsumed under the other. But if ‘particle’ has the same meaning in 
QM as in classical mechanics then since possession of simultaneous position and 
momentum by an item is logically necessary for labelling that item a classical 
particle, which in turn implies description of its motion in terms of a continuous 
path, we must conclude that the particles referred to in the quantum-mechanical 
theory must also be regarded as possessing in reality simultaneous position and 
momentum and as moving in continuous paths, albeit we cannot uniquely 
predict what particular continuous path would result from a particular causal 
disturbance. The argument is I believe entirely fallacious. What we actually 
require for the existence of the correspondence limit is that the predicate 
‘particle-like in a certain specified respect’ with a different predicate for each 
respect—localisability or carrier of specific momentum for example—must be 
meaning-invariant. In certain limiting situations the extensions of these pre- 
dicates may coalesce. This is what we would call a classical particle but to argue 
from the meaning-invariance of the ‘partial’ predicates to the coalescence of
their extensions in all situations is to make a fundamental confusion between the 
notions of the intension and the extension of a predicate. 

Since the nature of the classical correspondence limit is such a basic issue in 
the philosophy of QM it is of considerable interest to contrast Jauch’s position 
on this. Jauch’s presentation of his ideas on this subject in the book under review
is not as clear as one might wish, and since I believe them to be essentially
correct I will attempt a brief clarification of the position as I see it. The problem 
is basically this. According to QM there is no reason why macroscopic objects 
such as tables, beloved of philosophers, or human beings or planets or whatever 
should not exist in ‘fuzzy’ superpositions of states. But although these states are 
not forbidden, as a matter of contingent fact they do not seem to occur. The usual 
‘naive’ approach to explaining the non-occurrence of these unacceptable super- 
position states for macroscopic objects is to invoke initial conditions. What we 
then have to show, for consistency, is that if a macroscopic object is initially in 
an acceptable state it will not in course of time evolve into an unacceptable state. 
At this point there usually follows some discussion of diffusion of wave packets.
For example to put in some figures, one can easily show that if a mass of 10 gms
had its initial position specified to within 10^-[…] cm, then after 10^[…] years the size
of the wave packet would have increased to only about 10^-[…] cm. By contrast an
electron, even if confined initially only to within a centimetre, would spread in
10^[…] years over a distance of the order of a light-year. The explanation afforded
here is similar to that used to rule out higher-dimensional permutational 
symmetries in the theory of identical particles. Here again we impose an initial 
condition plus the fact that one irreducible representation can never turn into 
another if the wave function develops in time according to a symmetric 
Hamiltonian. Implicitly the same argument can be used to show for example
that the gravitational constant in Newtonian celestial mechanics has the observed
value G_obs and not some other value. Essentially this can be expressed as the
conjunction of two claims, G does not vary with time and the initial value of G
is G_obs. The point I am making here is that the method of arguing via initial
conditions for the non-appearance of some physical state, which the underlying 
theory does not itself rule out, is widely used in physics. This type of explanation 
does however seem somewhat arbitrary and indeed smacks of an ad hoc 
manoeuvre. But in the case of the classical limit of QM we actually run into an 
immediate difficulty. While it may be true that a macrosystem will not of itself 
evolve from an acceptable state into an unacceptable state, we must also question 
whether the interaction between a macrosystem and a microsystem may not lead 
to an unacceptable state for the former. This problem is particularly pressing in 
that such an interaction must be presupposed in any quantum theory of measure- 
ment, and it is just here that the classically describable behaviour of the apparatus 
is apparently a cornerstone of the Copenhagen interpretation of QM. Un- 
fortunately the answer to this question is that measurement interactions do 
apparently in general lead to unacceptable superposition states for the apparatus, 
or more strictly for the microsystem-cum-apparatus considered as a whole. So 
in a sense it appears that the apparatus does not behave in every respect (in- 
cluding correlation effects between microsystem and macroscopic apparatus) in 
a classical way. But this circumstance may actually be a clue as to how we should 
proceed. Is it really the case that we want a theory that tells us that macroscopic 
objects behave in every respect classically? In fact it should be clear that if we 
look for the quantum we shall find it. After all if we observe the properties of a
particular atom or molecule as a constituent of the macrosystem then in respect
of such a closely conducted inspection we certainly would expect to reveal
quantal behaviour. Furthermore there are many macro-phenomena which
express in amplified form quantal aspects of matter. Thus macroscopic quantal
states, corresponding to long-range coherence effects occur in phenomena of 
superconductivity, Josephson effect, superfluid properties of liquid Helium, etc. 
If we were actually able to prove that all macroscopic systems behaved in all 
respects in a classical way we should be proving much more than we want, and 
actually would eliminate some of the successful explanatory power of QM as in 
the coherence phenomena already cited. Furthermore, as Audi correctly em- 
phasises, Bohr never denied that QM is applicable to macroscopic systems and 
in his famous analyses of thought experiments the argument always turns on the 
inherent limitations implied by the uncertainty relations as applied to macro- 
scopic slits, shutters, etc. Having regard to all these circumstances the view 
advocated by Jauch seems reasonable, namely it is only with respect to a limited 
class of observables, what we may call classical observables, that macrosystems 
always appear to exhibit classical behaviour. Jauch has elsewhere¹ developed this
approach by introducing equivalence classes of microstates which are indistinguishable
with respect to the set of classical observables, and such that every
unacceptable state that a macrosystem may come to find itself in, for example as 
a result of a measurement interaction, is equivalent to some acceptable state. 
Specifically unacceptable superpositions turn out to be equivalent to acceptable 
mixtures. Mixed states of macroscopic objects are of course allowed classically, 
and are the subject-matter of classical statistical mechanics. Jauch has not himself
followed the constructive approach to classical observables propounded by von 
Neumann, van Kampen and others, but regards the existence of Abelian sets of 
such observables as a necessary and sufficient condition for the possibility of 
conducting measurements on microsystems. In fact this claim by Jauch appears 
to be exaggerated. With regard to sufficiency it appears necessary to supplement 
Jauch’s conditions by a criterion of irreversible approach to thermodynamic 
equilibrium for the amplification stage of a measurement process. The reason 
for this is that Jauch’s theory turns out to be inconsistent with the quantal time 
evolution of the equivalence classes of microstates, that is to say, states in the
same equivalence class at one time may evolve into different equivalence classes 
at a later time. It appears then that macroscopic systems can only be ‘objectified’ 
in Jauch’s sense when thermodynamic equilibrium has been achieved. The 
problem of the objectification of objects in non-equilibrium states remains. With 
regard to Jauch’s claim of necessity, it is quite possible to develop a theory of 
non-Abelian sets of observables for distinguishing equivalence classes of micro- 
states which when combined with a Bohm-type theory of measurement inter- 
actions can produce objectification even in non-equilibrium states. Such develop- 
ments, with which the present reviewer has been recently engaged, suggest that 
Jauch’s approach, while substantially correct, needs some elaboration before the 
puzzling nature of the classical limit of QM can be regarded as solved. 
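
The claim that an unacceptable superposition can be equivalent to an acceptable mixture with respect to a commuting set of observables can be checked in the smallest possible case. The sketch below is a drastic simplification and is not Jauch’s own construction: it assumes a two-state system in place of a macroscopic apparatus, takes the ‘classical’ observables to be those diagonal in an assumed pointer basis, and uses hypothetical names (`trace_product`, `equivalent`, `CLASSICAL`, `INTERFERENCE`) throughout.

```python
def trace_product(rho, obs):
    """Expectation value Tr(rho * obs) for real 2x2 matrices (nested lists)."""
    return sum(rho[i][j] * obs[j][i] for i in range(2) for j in range(2))


# Density matrix of the superposition (|0> + |1>)/sqrt(2).
SUPERPOSITION = [[0.5, 0.5],
                 [0.5, 0.5]]

# Density matrix of the equal-weight mixture of |0> and |1>.
MIXTURE = [[0.5, 0.0],
           [0.0, 0.5]]

# An Abelian (mutually commuting) set: everything diagonal in the pointer basis.
CLASSICAL = [
    [[1.0, 0.0], [0.0, 0.0]],    # 'pointer is at position 0'
    [[0.0, 0.0], [0.0, 1.0]],    # 'pointer is at position 1'
    [[1.0, 0.0], [0.0, -1.0]],   # pointer reading
]

# An observable outside the classical set; it is sensitive to interference terms.
INTERFERENCE = [[0.0, 1.0],
                [1.0, 0.0]]


def equivalent(rho_a, rho_b, observables, tol=1e-12):
    """True if the two states give the same expectation value for every
    observable in the given set."""
    return all(
        abs(trace_product(rho_a, obs) - trace_product(rho_b, obs)) < tol
        for obs in observables
    )


print(equivalent(SUPERPOSITION, MIXTURE, CLASSICAL))
print(equivalent(SUPERPOSITION, MIXTURE, [INTERFERENCE]))
```

With respect to the diagonal observables the two states are indistinguishable, since only the diagonal entries of the density matrix contribute. The off-diagonal observable separates them. The sketch says nothing about time evolution, which is where the difficulty discussed above arises: states equivalent at one time need not remain so under the quantal dynamics.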


4 As I have already stated Audi turns, in the final chapter of his book, to the
discussion of relativistic quantum field theory, RQFT, and purports to show
that the vaunted unification of wave and particle pictures that RQFT is widely
thought to provide is due to deep misunderstanding. For Audi there are wave 
theories and particle theories and never the twain shall meet. In particular 


1 See his [1964] or his [1968].

electromagnetic waves are to be sharply distinguished from matter waves, and
the photon, the putative particle of light, is not to be regarded as a particle at all. 

Let me begin by explaining the orthodox view and then consider Audi’s 
criticisms. The modern approach to the quantum theory of radiation was 
developed in a remarkable paper of Dirac ([1927]). By the time Dirac wrote this 
paper it was clear that there were two apparently quite different ways of treating 
problems about electromagnetic radiation. Thus Debye ([1910]) had derived 
Planck’s law for the distribution of energy among different wavelengths in black- 
body radiation by quantising the motion of the oscillators represented by the 
normal modes of the radiation field. By contrast Bose ([1924]) had derived the 
same law by regarding radiation as a ‘gas’ of photons subject to a quantum 
statistics. Neither Debye nor Bose discussed the detailed dynamical processes 
involved in the emission and absorption of radiation. The general framework for 
such a discussion was provided by Einstein ([1917]), who introduced the famous 
A and B coefficients associated with spontaneous and stimulated emission. 
Dirac’s [1927] paper achieved two things: (1) it gave a theoretical basis for
calculating the Einstein coefficients, (2) it clarified the relationship between the 
approaches of Debye and Bose to radiation problems. According to Dirac one 
could treat the radiation field in two distinct ways and arrive at the same final 
result, a quantised field, which united and transcended the wave and particle 
pictures of light. One way of arriving at a quantised field was to start with a 
classical field (in this case governed by Maxwell’s equations) and subject it to a 
quantisation procedure. This is the so-called method of field quantisation. The 
second method was to start with a theory of an assembly of classical particles, 
subject it to a first quantisation leading to a many-particle Schrödinger equation 
and then reformulate the resulting theory by means of an operator formalism, 
usually referred to as second quantisation, so as to yield a description in terms of 
a quantised field. In Dirac’s own words his results achieved ‘a complete harmony 
between the wave and light-quantum descriptions of the interaction [between 
atoms and electromagnetic waves]’. Dirac’s ideas were immediately extended so 
as to apply to wave-particle duality for material particles by Jordan and Klein 
([1927]) and Jordan and Wigner ([1927]). These results suggested a remarkable
parallel between the problems of wave-particle duality for material particles and 
for electromagnetic radiation, a point of view which was whole-heartedly adopted 
by Heisenberg and Pauli ([1929]) in their relativistic formulation of quantum 
electrodynamics. Audi wants to deny the analogy except at a purely formal level.
To begin with Audi points to the classical limit for the two types of theory—on 
the one hand a field theory (Maxwell’s equations) and on the other hand a 
classical particle theory. Once again Audi wants to argue, via meaning-invariance, 
from the classical limit to the quantal description. We have already criticised this 
move as illegitimate, and anyway Audi finds himself in difficulties when discussing
material boson particles, such as π mesons. Here he wants to say that the
classical limit is classical particle mechanics. In fact the classical limit in this case 
is a classical force field, associated with the strong binding forces between 
nucleons. Audi comments that the precise relationship between pions and intra- 
nuclear force fields is ‘beyond the scope of the present study’. I suppose this is 
one way of avoiding an objection!
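
The sense in which the routes of Debye and Bose lead to the same result can be indicated in modern operator notation. This is a sketch only, not Dirac’s own presentation: the mode label k, the operators a_k, the omission of the zero-point energy and the use of a thermal average at temperature T are assumptions made here for illustration.

```latex
\begin{align*}
  H &= \sum_{k} \hbar\omega_{k}\, a_{k}^{\dagger} a_{k}, \qquad
      [a_{k}, a_{k'}^{\dagger}] = \delta_{kk'},\\
  a_{k}^{\dagger} a_{k}\,|n_{k}\rangle &= n_{k}\,|n_{k}\rangle, \qquad
      n_{k} = 0, 1, 2, \dots\\
  \langle n_{k}\rangle &= \frac{1}{e^{\hbar\omega_{k}/k_{B}T} - 1}, \qquad
  \langle E_{k}\rangle = \hbar\omega_{k}\,\langle n_{k}\rangle
      = \frac{\hbar\omega_{k}}{e^{\hbar\omega_{k}/k_{B}T} - 1}.
\end{align*}
```

The integer n_k may be read either as the excitation level of the oscillator associated with normal mode k, as on Debye’s treatment, or as the number of photons of energy ħω_k occupying that mode, as on Bose’s. Both readings yield the same mean energy per mode and hence Planck’s law.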

A point which Audi regards as decisive in his discussion is that photons are
not localisable. Since this point is often made in the literature it is worth
examining what this claim actually amounts to. The problem of the localisability 
of an elementary particle can be discussed in the context of first or second 
quantisation. In a first-quantised relativistic theory one can attempt to construct 
a position operator whose eigenfunctions satisfy appropriate general conditions 
first stated by Newton and Wigner in their [1949], which express essentially 
invariance properties imposed by the homogeneity and isotropy of space on the 
compatibility of position measurements at different locations. For massless 
particles of spin ≥ 1, which would include the photon with spin one, these 
authors were unable to construct a position operator satisfying their conditions. 
In his [1959] Fronsdal showed that if one did not require covariance with respect 
to rotations a position operator could be constructed for the photon. Wightman 
([1962]) extended the work of Newton and Wigner by applying Mackey’s theory 
of imprimitive representations of the Euclidean group. Wightman agreed that 
the photon could not be localised, while for the neutrino localisability depended 
on whether or not the neutrino possessed a definite helicity. (It would be interesting 
to know whether Audi would class the neutrino as a particle.) Jauch and Piron 
in their [1967] sought to relax the Newton–Wigner–Wightman criteria for 
localised wave functions in a way less open to objection than Fronsdal’s [1959], 
namely by supposing that compatibility of localisation measurements is not 
necessarily valid for overlapping space domains. They concluded that photons 
were in this sense weakly localisable. It is clear from this brief review of some of 
the literature that the statement that the photon is not localisable requires careful 
discussion of what exactly is meant by localisability. However it is important to 
notice that even if we succeed in constructing a position operator in the first- 
quantised theory as for example in the case of massive particles such as the 
electron, there is no guarantee that exact localisation experiments corresponding 
to such an operator could be performed in view of limitations imposed when we 
look at the second-quantised version of the theory which allows for the possibility 
of particle creation and annihilation. It then transpires that if we attempt to 
localise a particle of rest mass m to within a radius smaller than its Compton 
wavelength h/mc, then the associated uncertainty in the energy of the particle 
will be sufficient to materialise additional particles in accordance with the 
Einstein E = mc² formula. So the localisation of a single particle in RQFT is 
always circumvented by the creation of additional particles. In this sense the 
electron cannot be exactly localised any more than the photon. Of course there 
are differences between photons and electrons (in addition to their different 
statistics, which derives from their different spins). But the special properties of 
photons are due to their vanishing rest mass, which in turn is associated with the 
special invariance properties of massless fields, conformal invariance and gauge 
invariance. In terms of Wigner’s fundamental analysis of elementary relativistic 
systems, according to irreducible representations of the Poincaré group,¹ the 
difference between massless particles like the photon and massive particles like 
the electron reveals itself in the distinction between a non-compact little group 
isomorphic to the group of Euclidean motions in a plane associated with the 
former and a compact little group isomorphic to the three-dimensional rotation 
group associated with the latter. These are the issues that Audi should address 


1 See his [1939]. 
himself to in discussing the differences between photons and electrons, rather 
than the ontological question of wave versus particle which Audi takes to be the 
key distinction. 
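
The order-of-magnitude reasoning behind the Compton-wavelength remark above can be sketched as follows. This is only an illustration: it assumes the usual position-momentum uncertainty relation, takes the energy spread of a relativistic particle to be of order c times its momentum spread, and ignores numerical factors such as 2π (so it does not distinguish h from ħ). None of these steps is spelled out in the review itself.

```latex
\Delta x\,\Delta p \gtrsim \hbar ,
\qquad
\Delta x \lesssim \frac{\hbar}{mc}
\;\Longrightarrow\;
\Delta p \gtrsim mc
\;\Longrightarrow\;
\Delta E \sim c\,\Delta p \gtrsim mc^{2} .
```

On this rough estimate, confinement within a Compton wavelength already makes an energy of order mc² available, which is the threshold the review associates with the materialisation of additional particles.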

As an example of the muddled physics in this chapter, Audi argues as another 
disanalogy between photons and electrons that when photons are absorbed their 
energy is transferred back to the electromagnetic field, whereas in the annihilation 
of electron-positron pairs the energy is not transferred to the Dirac field. These 
remarks overlook the photoelectric effect and the radiationless annihilation of 
positrons via Auger transitions. 


5 There are many other topics discussed in the books by Jauch and Audi, 
which I have not space to consider, ranging from Jauch on the psychoanalytic 
origin of concepts to Audi on quantum logic and the Duane—Landé account of 
electron diffraction. I have concentrated on those issues which I consider of 
major importance in the philosophy of QM. 

The two books are quite different in style. Jauch is a physicist indulging in 
some fairly lightweight philosophical speculation. The book is beautifully written 
and can be recommended on that account alone. Audi, by contrast, is much more 
ambitious philosophically speaking. The style is brash and extremely polemical 
but I think that many of the detailed arguments are wrong, as I have tried to 
indicate. I cannot recommend the book as an important contribution to our 
understanding of quantum mechanics. 


M. L. G. REDHEAD 
Chelsea College 
University of London 
KNOWLEDGE OF LANGUAGE* 


David Cooper’s book Knowledge of Language is concerned largely with what he 
calls the ‘knowledge of language thesis’ (or KLT for short). According to KLT, 
the user of some language has ‘tacit’ (or unconscious) knowledge of the rules of 
the grammar of that language. Furthermore, according to KLT, it is this know- 
ledge which (partially) enables the user to use his language in the way in which 
he does. As so formulated, KLT is a thesis which (until recently at least¹) has 
repeatedly been advanced by Chomsky, Katz, and others. Thus, for instance, 
Chomsky has stated that ‘every speaker of a language has mastered and internalised 
a generative grammar that expresses his knowledge of his language’,² 
while Katz has spoken of the user’s ‘internalised knowledge of the grammar’ of 
his language.³ 

Cooper (like Graves et al.,⁴ Fodor,⁵ Moravcsik,⁶ Arbini⁷ and others) sees KLT 
as important primarily in relation to another thesis also characteristically 
advanced by linguists and philosophers who have been influenced by Chomsky. 
We might call this thesis ‘Chomsky’s epistemological thesis’ (or CET for short). 
According to CET, research in linguistics is relevant to the problem of how 


* Review of David Cooper, Knowledge of Language. Priam Press. £1.95. Pp. 196. 

1 Chomsky has recently withdrawn to a more modest position in which he claims merely 
that the language user ‘cognizes’ the grammar of his language. See his [1976], p. 164. 

2 Chomsky [1965], p. 8. 3 Katz [1971], p. 131. 
learning occurs and, in fact, supports what might fairly be called a ‘rationalist’ 
account of the learning process, as opposed to what might be called an ‘empiricist’ 
one.¹ Relative to the relation between this thesis and KLT, Cooper claims that 
CET ‘stems, primarily, from a certain view [i.e. KLT] about what it is to know 
a language’ and adds that ‘to the extent that the relevant views of knowing a 
language are discredited, then to that extent the main prop is taken from beneath’ 
CET. On this view, then, the validity of CET is seen as presupposing that of 
KLT. 

That KLT and CET are of necessity related in this way depends on a particular 
view (a) of what was at issue in the traditional debate between empiricists and 
rationalists and (b) of how faithful to the original a modern reconstruction of that 
debate would have to be in order to be relevant to it. Re (a), I claim that Cooper 
(despite a mild demurrer³) sees the traditional debate primarily in epistemic 
terms—i.e. as a debate about the existence of innate knowledge.⁴ Re (b), I claim 
that Cooper takes the view that ‘there is little of philosophical contention in 
Chomsky’s doctrine’ about language learning unless it can be shown that 
research in linguistics has some more or less direct bearing on some quintessentially 
‘philosophical’ doctrine about learning.⁵ 

Thus, my first criticisms of Cooper’s thesis: 

First, I doubt (rather more strongly than Cooper does) that the question of the 
existence of innate knowledge was the sole (and perhaps even the primary) issue 
between traditional rationalists and empiricists. Thus, I am inclined to agree 
with Chomsky, when he implies that the traditional debate can be reconstructed 
(at least partially) solely in terms of the conflicting claims made by rationalists 
and empiricists about the kinds of innate mental faculties which each were willing 
to countenance.⁶ That the issue can be framed in these terms has, perhaps, been 
obscured by the fact that both empiricists and rationalists did countenance some 
kinds of innate mental faculties.⁷ Nevertheless, close inspection reveals, I think, 
that empiricists and rationalists consistently postulated the existence of quite 
different kinds of such faculties. 

Thus, for instance, to take what is, I think, only the clearest example of their 
difference in this respect: Empiricists consistently denied and rationalists just as 
consistently asserted the existence of innate mental faculties whose operations 
are such as to lead to the association of logically independent ideas prior to 
extended experience of their connection. See, for instance, Locke’s Essay (IV, iii, 
10), Hume’s Treatise (I, iii, 1), and Mill’s Logic (I, viii, 4) for the empiricist 
position on this matter, and compare these passages with, for instance, Descartes’ 
Optics (vi), Leibniz’s New Essays (IV, iii, 8), and Whewell’s Philosophy of the 
Inductive Sciences (I, II, vi). 

But, of course, if it is the case that traditional rationalist and empiricist 
1 Chomsky [1967], p. 9. 2 [1975], pp. 5-6. 

3 [1975], p. 4, where he notes that ‘I do not think that Chomsky’s views would lose all 
philosophical fascination were we to opt for speaking only of predispositions and the like.’ 

4 See especially Cooper [1972], p. 483. 

5 [1972], p. 466. 6 [1976], p. 216. 

7 That the debate about CET has been muddied by a failure to realise that empiricists 
and rationalists can be distinguished in this way can be seen in the insistence of many 
commentators on CET that the traditional debate must be understood in epistemic 
examples of this attitude. 
accounts of learning can be compared independently of their claims about the 
existence of innate knowledge, then it is the case that the validity of CET does 
not presuppose that of KLT, since, for instance, CET might be supported by 
empirical research confirming the existence of innate mental faculties of a kind 
countenanced by rationalists but not by empiricists. By giving this possibility 
insufficient consideration, Cooper has vitiated his thesis that CET stands or falls 
with KLT. 

Second, ignoring the possibility of a pointless quibble on the meaning of the 
word ‘philosophical’, I deny what I take to be Cooper’s claim re (b) above—viz. 
that empirical theses (e.g. about how learning occurs) are more or less irrelevant 
to the probity of so-called ‘philosophical’ theses about roughly the same subject 
matter. There are two points to be made here: 

In this particular case, I am inclined to agree with Chomsky when he notes 
‘that it is a mistake to read Descartes, the minor Cartesians, Hume and others as 
if they accepted some modern distinction between “scientific” and “philosophical” 
concerns’, and adds that these ‘philosophers’ advanced many detailed substantive 
(i.e. empirical) theses about how learning occurs.¹ If this is the case, then to say 
(as Cooper does) that there ‘is little of philosophical contention in Chomsky’s 
doctrine’ is not to say that it is irrelevant to the traditional debate between 
empiricists and rationalists. For this debate, though couched in more or less 
‘philosophical’ language was, on this view, at least partly one about certain 
questions of fact. Thus, facts about language learning uncovered by research in 
linguistics may be relevant to it. 

More generally, I agree with Watkins² that even quintessentially philosophical 
theses can and do have certain logical relations with quintessentially empirical 
ones. Thus, for instance, the ‘metaphysical’ all-some statement M that every 
event has a cause is logically inconsistent with the empirical statement E which 
asserts the existence of a particular uncaused event. In virtue of this relation, 
Watkins has pointed out that metaphysical theses can ‘influence’ the develop- 
ment of scientific theories. For instance, a scientist who subscribed to M could 
not also consistently subscribe to E and, thus, could be expected to abstain 
from constructing any theory T from which E followed. 

But, if this point is correct, then it may be that current theories of language 
learning are inconsistent (in this sense) with certain traditional (perhaps 
‘philosophical’) accounts of learning, and thus bear on their probity. To say this 
is not of course to say that this inconsistency would amount to a ‘refutation’ of 
such accounts. For (a) if a truly philosophical (i.e. ‘syntactically metaphysical’) 
account of learning does exist, then it is, for syntactic reasons, irrefutable; and 
(b) the empirical theory with which it is inconsistent may itself be false. Never- 
theless, to say that some current empirical (say Chomskian) theory of language 
acquisition is inconsistent with some traditional philosophical (say Humean) 
account of learning is to say that we (currently) have (fallible) grounds for 
supposing that the philosophical account is incorrect and, thus, for supposing 
CET to be confirmed. 

For these two reasons, then, I reject what I take to be one of the primary 
motivations for Cooper’s entire approach—viz. that the prima facie irrelevance 
of empirical to philosophical theses forces us to construct some ‘philosophically 
worded’ account (i.e. KLT) of a fundamentally empirical thesis in order to 
demonstrate the relevance of that thesis to certain philosophical questions (i.e. 
the debate between empiricists and rationalists). Of course, if this view is incorrect, 
then objections to KLT have no conclusive significance for the probity 
of CET and, once again, Cooper’s claim that CET stands or falls with KLT is 
vitiated. 

Furthermore, in view of the possible independence of KLT and CET, much 
of the motivation for Cooper’s exercise (i.e. the undermining of KLT) is called 
into question. Nevertheless, even in such a case, it would still be reasonable to 
consider the force of Cooper’s objections to KLT. For KLT itself is an interesting 
and important thesis—if only, for instance, because its validity implies the 
inadequacy of Ryle’s¹ familiar dichotomisation of ‘knowledge’ into ‘knowing 
how’ and ‘knowing that’. I thus turn now to a brief consideration of Cooper’s 
criticisms of KLT. 

If, following Popper,² we regard ‘essentialism’ as involving a misplaced interest 
in the ‘essential’ meanings of words, then we can, I think, criticise some of 
Cooper’s arguments against KLT as essentialist. Thus, his arguments in 
chapters 3 and 4 both involve claiming (a) that the word ‘knowledge’ has some 
paradigmatic or essential meaning; (b) that any extension of the use of this word 
to cover events not essentially denoted by it must be justified in terms of certain 
extenuating circumstances; (c) that such extensions of usage cannot be justified 
in the case of purported knowledge of the grammar of a language; and, thus, 
(d) that KLT is false. 

Quite aside from particular objections (of other sorts) which one might make 
to Cooper’s specific arguments to this effect, I claim that such an essentialist 
strategy for undermining KLT is wrong-headed and misplaced. For I agree with 
Popper³ that the important thing about any thesis, whether empirical or philosophical, 
is its relation to the problem it is intended to solve. In particular, I 
claim that it is of little interest whether the knowledge of grammar postulated 
by KLT is of a sort which conforms to some essential definition of the word 
‘knowledge’, as long as it is necessary to postulate some such ‘knowledge’ in order to 
solve some explanatory problem. In fact, the strongest argument for KLT known 
to me takes the form of arguing that precisely this is the case—i.e. that it is 
necessary to postulate knowledge of grammar in order to explain certain facts 
about the language user. Perhaps we can attribute Cooper’s failure even to 
consider this argument seriously to his acceptance of an essentialist strategy.⁴ In 
any event, his failure to do so is clearly a major failing in his presentation. This 
is all the more ironic because this argument does not in fact succeed in establish- 
ing support for KLT. 

It is appropriate here for me to reveal my own hand: Like Cooper I believe 
that KLT is unsupported. Unlike Cooper I believe that this is the case not 
because it is somehow conceptually inappropriate (relative to an essential 
definition of ‘knowledge’) to postulate knowledge of grammars, but because it is 
unnecessary to do so (relative to the range of explanatory problems usually dealt 
1 Ryle draws this distinction in his [1949]. 

2 See his [1974], Section 7. 3 Ibid. 
4 For confirmation of this view, see his [1975], p. 61, where he mentions this argument 
only to dismiss its relevance in the absence of any reason to suppose that it is coherent 
to speak of knowledge of grammar in this way at all. 
with). Furthermore, where for Cooper (and many of those who support KLT and 
whom he opposes¹) a commitment to the falsity of KLT implies a commitment 
to the falsity of CET, for me these two theses are happily quite independent (in 
view of my previous remarks about the relation between them). Thus, I can 
reject KLT without being committed to rejecting CET along with it. 

Both because of the intrinsic interest of the argument for KLT just mentioned 
and in order to fill a lacuna left by Cooper’s failure to consider it, I shall now 
reconstruct that argument and then say why I think it fails to support KLT. 
Notice that throughout my discussion the issues in question in no way concern 
the essential meaning of the word ‘knowledge’ but only the purported explanatory 
power of KLT. 

The argument for KLT to be considered is due, in its final form, to Graves 
et al.,² though it was earlier outlined by Fodor,³ Moravcsik,⁴ Arbini⁵ and others. 
The basic strategy behind this argument is the following: (a) More or less 
straightforward empirical considerations support the empirical hypothesis that 
we should attribute to the language user a certain mechanism M whose operations 
are such as to figure in the user’s displays of his ‘competence’ to use his language. 
(b) Nevertheless, there are things which the language user is able to do which 
cannot be explained solely by the attribution to him of M. (c) These abilities of 
the language user can, however, be explained on the assumption that he has 
‘tacit knowledge’ of the grammar of his language. (d) Thus, attribution to the 
user of such knowledge has excess explanatory power over the attribution to him 
of M. (e) Thus, we are justified in supposing that the language user has tacit 
knowledge of the grammar of his language—i.e. KLT is supported. 

Specifically, an argument based on this strategy takes the following form: 

(i) Language users are capable of asserting a (potentially infinite) set of 
propositions about their language. Thus, for instance, in a whole range of cases 
language users will, when queried, typically assert that a given string of words 
either is or is not an ‘acceptable’ sentence of their language. It is claimed, by 
Graves et al., that these assertions are explicit manifestations of instances of the 
language user’s knowledge about his language. That is, it is claimed that ‘observa- 
tion statements’ reporting these assertions should take the form P: ‘User U 
knows proposition p’. 

(ii) If P is taken as explanandum for which some theory is to be offered as 
explanans, then it is a logical requirement for the adequacy of that theory that it 
contain epistemic predicates, since P itself does. 

(iii) A theory T, which attributes M to U, does not contain epistemic predicates 
and, thus, fails to satisfy the logical requirements for the adequacy of explanations 
of P. 

(iv) A theory T′, which attributes knowledge of the grammar G to U, does 
contain epistemic predicates and so does not fail to satisfy the logical requirements 
for the adequacy of explanations of P. (In particular, an explanation of P 
in terms of T′ might be as follows: U knows G; G implies p; thus, U knows p; 
thus, P.) 

(v) Thus, T′ has excess explanatory power over T. Thus, it is necessary to 


1 For this attitude among supporters of CET, see Graves et al. [1973]. 
2 Op. cit. 3 Op. cit. 
4 Op. cit. 5 Op. cit. 
attribute knowledge of the grammar G to U in order to explain U’s ability to 
assert propositions p. Thus, KLT is supported. 

What is wrong with this argument? I claim that this argument is invalid 
because it omits mention of the role which an ‘observational theory’ Tₒ plays 
both in the decision to describe what U does when he asserts p as a manifestation 
of his knowledge of p and in the explanation of the ‘observation statement’ P 
which results from taking such a decision. In short, the physical event e by means 
of which U asserts p does not come labelled as it were as a manifestation of his 
knowledge of p. In other words, our reasons for using P to describe e are (a) that 
there is a theory Tₒ according to which, if e (under some low level description E) 
meets certain conditions C, then e may be described by P; and (b) that e meets C. 
But, in this case, there may be an explanation of P, involving T, which is 
logically adequate, but which does not involve attributing knowledge of G to U. 
Specifically, such an explanation might be as follows: Since M is constructed in 
such a way that, under the conditions in question, its operations are such as to 
lead to the physical event e, T implies E. Furthermore, Tₒ and E may 
together imply P. Thus, T and Tₒ, neither of which attributes knowledge of G to U, 
together imply P. Thus, T′ has no excess explanatory power over the conjunction 
of T and Tₒ. Thus, it is not necessary to attribute knowledge of G to U in order 
to explain his assertions about his language. Thus, KLT is not supported. Here, 
then, I agree with Cooper, though obviously for different reasons and even, I 
might say, for different kinds of reasons. 
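
The logical skeleton of this rebuttal can be set out schematically. The layout below is only a sketch: it writes T_O for the observational theory and reads 'implies' as derivability, both of which are notational assumptions rather than anything fixed by the text.

```latex
\begin{align*}
&(1)\quad T \vdash E
  && \text{($M$ operates so as to produce the event $e$, described by $E$)}\\
&(2)\quad T_{O} \wedge E \vdash P
  && \text{(the observational theory licenses describing $e$ by $P$)}\\
&(3)\quad T \wedge T_{O} \vdash P
  && \text{(from (1) and (2))}
\end{align*}
```

Since neither premise attributes knowledge of G to U, step (3) yields P without the epistemic theory T′, which is all the rebuttal needs.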

KLT was intended to provide a general explanation of various aspects of the 
language user’s competence to use his language. It was intended, that is, to 
provide an account of those structures and processes which underlie and eventuate 
in manifestations of that competence. If, as both Cooper and I claim, KLT is 
unsupported, then the question arises of how we are otherwise to account for the 
language user’s competence. 

Cooper seems content to point out that the user’s competence is a disposition 
to behave in certain ways under certain broadly specifiable circumstances, and to 
note that an adequate grammar of his language describes this disposition. With 
this view I am in more or less complete agreement. I find Chomsky’s many 
arguments to the effect that the user’s competence is not ‘merely’ some set of 
dispositions more or less unconvincing—essentially for the reasons given by 
Cooper in chapter 7 of his book. 

Where Cooper and I part company is over his rejection of the idea that a 
grammar can somehow be construed as a psychological theory.¹ He seems to 
think that accepting this idea commits one more or less automatically to accepting 
KLT. There is, moreover, some justification for his thinking so, since a familiar 
non sequitur in the linguistic literature is that which argues to KLT from the 
need to postulate certain mental events and states as underlying and eventuating 
in manifestations of competence.² Nevertheless, these positions are distinct. For 
it seems to me perfectly possible to acknowledge the necessity of postulating such 
states and processes without also admitting that they need involve knowledge of 
some grammar. In particular, it is perfectly possible to see a grammar as a 


1 See his [1975], p. 87. 
2 For examples of this non sequitur, see Katz [1971], pp. 130-1, Slobin [1974], pp. 67, 
Moravcsik [1969], pp. 416-17, etc. 
psychological theory (that is, to claim that a grammar describes mental states and 
processes) without claiming that it is known to the user whose mental states and 
processes it describes. 

And, if this possibility is a real one, then there is a third position intermediate 
between Cooper’s austere ‘dispositional account’ (i.e. that grammars describe 
dispositions) and Chomsky’s more venturesome KLT-based account (i.e. that 
the user’s competence depends on his knowledge of the grammar of his language). 
That is, it is possible to interpret grammars ‘realistically’ (i.e. as describing mental 
events and processes underlying manifestations of competence). Notice that this 
third account is fully explanatory—i.e. like the KLT-based account, it tells us 
on what basis manifestations of competence occur—without in any way involving 
the dubious claims inherent in the KLT-based account. Thus, I reject Cooper’s 
‘anti-realist’ attitude toward theories of linguistic competence without at the 
same time embracing the KLT-based account which we both reject. 

In conclusion, I should say that many of the criticisms I have made of Cooper’s 
presentation of the issues involved in current linguistic theorising can just as 
easily be directed against his opponents. In point of fact, the entire debate about 
these issues has rested, I think, on certain shared but nonetheless unsupported 
presuppositions about the range of alternatives open in attempting to construct 
explanatory theories of complex linguistic phenomena. Cooper’s book un- 
fortunately shares many of these presuppositions and so, ultimately, fails as a 
constructive critique of current positions—in particular, because it fails to advance 
the debate beyond the circumscribed set of positions implicitly defined by those 
presuppositions. It is, for all that, a thought-provoking and, in its detailed 
argumentation, challenging work. 


F. B. D'AGOSTINO 
London School of Economics and 
Massachusetts Institute of Technology 
Reviews 


Lakatos, I. [1976]: Proofs and Refutations: The Logic of Mathematical Discovery. 
(Edited by John Worrall and Elie Zahar.) London: Cambridge University 
Press, 1976. £7.50 cloth, £1.95 paperback. Pp. xii + 174. 


This posthumous volume is a supplemented reissue of the similarly entitled 
essay that appeared in four parts in The British Journal for the Philosophy of 
Science in 1963-4. 

The hundred pages of that remarkable essay ring changes on a single geometrical 
theme: Euler’s law that the faces and vertices of a polyhedron together 
outnumber the edges by two. After explaining the classical proof, Lakatos 
produces an exception: a hollow solid whose surfaces are a cube within a cube. 
Its faces and vertices outnumber its edges by four. Then he examines the classical 
proof to see how it falls foul of such examples, and what stipulations would be 
suitable for excluding them. Having thus narrowed the scope of Euler’s law, he 
produces a further exception: a solid consisting of two tetrahedra with only an 
edge or vertex in common. A further tightening of the law is thus indicated, and 
still the exceptions are not at an end. A polyhedron with a square tunnel through 
it occasions a further restriction; a cube with a penthouse on top occasions yet a 
further restriction; and so the dialectic of revision and exception goes its 
oscillating way. 
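
The arithmetic behind these exceptions is easy to check. The sketch below tabulates V − E + F for the simplest version of each solid mentioned above; the vertex, edge and face counts are worked out by hand for this illustration and are assumptions of the sketch, not figures quoted from Lakatos. In particular, the tunnelled solid is counted in its 'picture-frame' form, with the top and bottom rings each cut into four trapezoidal faces.

```python
# Euler's law says V - E + F = 2, i.e. faces and vertices together
# outnumber the edges by two. Each entry is (vertices, edges, faces),
# counted by hand for the simplest version of the solid (an assumption
# of this sketch, not data taken from the book under review).
SOLIDS = {
    "cube": (8, 12, 6),
    "cube within a cube (hollow)": (16, 24, 12),
    "two tetrahedra sharing a vertex": (7, 12, 8),
    "two tetrahedra sharing an edge": (6, 11, 8),
    "picture frame (square tunnel)": (16, 32, 16),
    "cube with a penthouse on top": (16, 24, 11),
}


def euler_characteristic(vertices: int, edges: int, faces: int) -> int:
    """Return V - E + F for the given counts."""
    return vertices - edges + faces


def exceptions_to_euler(solids: dict) -> dict:
    """Return the solids whose V - E + F differs from 2, with their values."""
    return {
        name: euler_characteristic(*counts)
        for name, counts in solids.items()
        if euler_characteristic(*counts) != 2
    }


for name, counts in SOLIDS.items():
    print(f"{name}: V - E + F = {euler_characteristic(*counts)}")
```

Only the plain cube satisfies the law on these counts; the hollow cube gives four, as the review states, and each later solid fails in its own way, which is why each calls for a separate restriction.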

The geometry is fascinating, but the purpose is philosophical. Lakatos is 
opposing the formalists’ conception of mathematical proofs, which represents 
them as effectively testable and, once tested, incontrovertible. He is opposing the 
notion, so central to logical positivism, that mathematics and natural science are 
methodologically unlike. 

In respect of sprightliness the style contrasts markedly with the subject 
matter. The text is a spirited dialogue among a teacher and sixteen pupils. The 
footnotes, nearly as voluminous as the text, furnish historical precedents for 
ideas and attitudes expressed in the dialogue. Lengthiness of footnotes is in most 
writings a sign of poor organisation: failure to worry one’s material coherently 
into one-dimensional prose. But not so here, where the two levels of the page 
distinguish systematically between the fictitious participants of the dialogue and 
real mathematicians of the past three centuries. The lower level is fully as 
rewarding as the upper, the invidious distinction between nine- and twelve-point 
type notwithstanding. The wealth of scholarship is overwhelming. 

The successive amputations from Euler’s law are cunningly contrived and 
ordered, so as not to disqualify too many counter-instances at any one time. A 
major amputation early in the treatment would mean losing most of the illustra- 
tions of the methodology of theory construction, as well as spoiling the fun. At 
points this prolonging of curtailments is strained. Thus one counter-instance that 
Lakatos exploits is Kepler’s star polyhedron, viewed as bounded by only twelve 
faces each of which is a star with an empty pentagonal centre. The more natural 
view of it, as bounded by sixty triangular faces, is allowed to emerge only later. 
The excuse for the first or dodecahedral account of the star polyhedron is that 
the five triangles composing each hollow star all lie in one plane, as befits a face. 
Yet in the fullness of time even the planarity of faces conveniently lapses, when a 
cylinder, of all things, is accepted as a three-faced polyhedron. 

This studied prolonging of curtailments serves not merely to prolong an 
iconoclastic holiday. It affords instructively varied illustrations of the dialectic 
of conjecture and refutation which Lakatos, following Popper, recognises as the 
logic of scientific discovery. A mathematical inquiry begins, he says, with a 
conjecture and, by way of tentative proof, a thought-experiment which de- 
composes the conjecture into subconjectures. Counter-instances to the original 
conjecture emerge; the proof is then re-examined in search of a guilty sub- 
conjecture, which at length is convicted in turn by a counter-instance. A new, 
tighter subconjecture is substituted; ‘counterexamples are turned into new 
examples—new fields of inquiry open up.’ And, as he well illustrates, the 
process can repeat itself. 

The volume includes three supplements to the original essay, all edited from 
Lakatos’s Cambridge Ph.D. thesis of 1961. One of these added pieces continues 
the dialogue on faces, edges, and vertices of polyhedra, concentrating now on 
Poincaré’s proof of Euler’s law by methods of vector algebra. It is not easy. A 
second supplement brings a different case study: no longer of Euler’s law, but 
of a law noted by Leibniz, and professedly proved by Cauchy, to the effect that 
the limit of a convergent series of continuous functions is a continuous function. 
The final supplement follows up with further case studies, and at this point the 
concern is with pedagogical values. Lakatos does not in the end deny the 
feasibility of full formalistic rigour in mathematical proof, but he makes an 
eloquent and conclusive case for preferring the heuristic style of conjecture and 
refutation in mathematical treatises and textbooks. 


W. V. QUINE 
Harvard University 


D. J. O’Connor [1975]: The Correspondence Theory of Truth, Hutchinson 
University Library. Pp. 144. £2.75. 

C. J. F. Williams [1976]: What Is Truth?, Cambridge University Press. Pp. 102. 
£4.90. 


O’Connor begins with a survey of ‘the difficulties in the way of making a clear 
and consistent philosophical theory out of a commonsense conviction—that true 
beliefs and statements correspond to facts’ (p. 128). He goes on to examine the 
relevance to the correspondence theory of Tarski’s semantic definition of truth 
and of the Austin/Strawson controversy; then in a final chapter he summarises 
what he takes to be the salvageable content of the correspondence theory. 

In part I, O’Connor concentrates on the familiar problems of explaining the 
terms of, and the exact character of, the correspondence relation. After reviewing 
some difficulties about rival candidates—sentences, statements, propositions, 
eternal sentences—he concludes that it is beliefs, which may or may not be 
verbally expressed, which are the primary bearers of truth and falsity. He goes 
on to distinguish ‘truth of expression’, which is the linguistic correctness of a 
speaker’s expression of his belief, and ‘truth of cognition’, which is the corre- 
spondence of a belief to a fact; and between ‘weak truth’, or ‘descriptive 
adequacy’ of an expression not actually affirmed of the situation it describes, 
and ‘strong truth’, which also requires affirmation (pp. 87-8). 

In the final chapter (p. 13q) O’Connor distinguishes: 


A. status rerum (‘the raw unexperienced welter of objects and events’) 
B. things and their properties, situations, events 
C. empirical statements. 


Of these, B is supposed to be a processed version, edited by sensation, perception, 
memory etc., of A; C to be a processed version, edited by semantic conventions, 
of B. Truth links A and C, and O’Connor takes whatever is of value in 
the correspondence theory to lie in the hypothesis that ‘some of the structural 
features of status rerum can be transmitted to us in conceptual and linguistic 
form’ (p. 131). 

The value of Tarski’s work, O’Connor thinks, lies in its contribution to our 
understanding of the link between B and C, but does not bear directly on the 
problem of empirical truth (p. 111). A moral he draws from Strawson’s criticisms 
of Austin’s version of the correspondence theory is that facts are ‘on the wrong 
side of the semantic fence’ to be the second term of the correspondence relation 
(p. 120), for which one requires, rather, the raw unconceptualised data represented 
by A. 

It is really not very clear how the minimal theory offered in the last chapter 
relates to the discussion in part I—the relation of the distinction between 
expressive and cognitive truth to the distinction between levels A, B and C can 
be indirect at best, and the distinction between strong and weak truth seems 
entirely irrelevant to it. These are symptoms of a pervasive weakness of the 
book; the intended direction of the argument is frequently obscure. I think this 
is connected with the self-imposed limits on O’Connor’s enterprise: he offers 
neither a thorough critical examination of traditional correspondence theories, 
nor a detailed presentation of an alternative version; it is not altogether surprising 
that the result is a somewhat undisciplined book. 

Furthermore, the standard of argument is very uneven; too often, O’Connor 
relies on dubious and unsubstantiated claims like: ‘Beliefs seem to have a certain 
priority ... as it is prima facie a necessary condition of a statement or other 
symbolic belief having a truth-value that the belief itself should have one’ (p. 28); 
‘... if one and the same sentence can take sometimes different truth-values and 
sometimes no truth-value, it is clearly useless to look to sentences as our truth- 
bearers’ (p. 36); ‘a sentence used simply as a grammatical example can hardly be 
said to be true or false’ (p. 48). Also dismaying is O’Connor’s penchant for 
drawing distinctions with scant regard for their pertinence. For instance, on 
page 28 he distinguishes between beliefs as states of mind and beliefs as the 
objects to which those states of mind are directed, and indicates that it is beliefs 
in the second sense which are the truthbearers; on page 29 he distinguishes 
between beliefs as dispositions and beliefs as occurrences, and indicates that it is 
beliefs as occurrences with which he is concerned. But since the second distinction 
applies to the first of the original pair of senses—a distinction among beliefs 
as acts—it is quite irrelevant if O’Connor is concerned with beliefs as contents. 
(It may be because of this confusion that O’Connor fails even to consider the 
possibility of identifying beliefs as contents with propositions.) And too often 
O’Connor seems to accept an objection and then goes on to make a claim which 
seems vulnerable to it; for example, on page 41 he apparently accepts Geach’s 
objection to the thesis that only asserted sentences can be true or false, but on 
page 81 he introduces the distinction of strong and weak truth, which seems to have 
equally awkward consequences for the intelligibility of the truth-conditions of, 
say, conditional or disjunctive assertions. 

O’Connor’s book is neither comprehensive enough, nor clear enough, to be a 
really good introduction to the traditional correspondence theory of truth, but it 
is too sketchy, and too weakly argued, to amount to an important original 
contribution to that theory. This is a pity, because O’Connor’s exposition of 
Tarski’s theory, which manages to be quite accurate and clear while avoiding 
technicalities, indicates that he could have written a much more useful book. 
(But even here, I am afraid, there is a lapse when, on pp. 109-10, he accuses 
Tarski of blurring the line between logical and factual truth; the truths of the 
class calculus to which Tarski’s definition of truth is originally applied are 
presumably logical truths, but Tarski’s definition does not, as O’Connor seems 
to fear, make truth and logical truth indistinguishable.) 

Williams regards the correspondence account as hardly amounting—in view 
of the highly problematic character of the correspondence relation—to a theory 
(p. 96). He offers, instead, what may not too misleadingly be described as a 
descendant of Ramsey’s redundancy theory, and a cousin of Prior and Mackie’s 
simple theory, and Grover et al.’s ‘prosentential’ theory of truth. 

His initial analysis is this: ‘What Percy says is true’ comes to ‘For some p, 
both Percy says that p, and p’. This preliminary account is then explained and 
refined by way of consideration of various difficulties and objections. It has been 
suspected that a sentence like ‘For some p, both Percy says that p, and p’ must 
be ungrammatical unless supplemented by a predicate to form a complete 
sentence, and that the suitable predicate must be ‘is true’—which would make 
the analysis circular. Williams denies that the final ‘p’ need be elliptical, and 
explains ‘For some p... p’ as true if (not iff) a true sentence can be obtained by 
omitting ‘For some p’ and putting a sentence at each remaining occurrence of 
‘p’, as in ‘Percy says that Mabel has measles, and Mabel has measles’. Williams 
does not regard ‘true’ as simply redundant; it is eliminable from ‘it is true that 
Caesar was murdered’, but not from ‘What Percy says is true’, where it is needed 
to make a complete proposition out of ‘What Percy says’. However, ‘What Percy 
says’ is not, according to Williams, a genuine referring expression; it is, rather, 
an incomplete symbol, no more a name than ‘What the postman brought’ in 
‘What the postman brought is on the mantelpiece’. The analogy leads to a more 
elaborate analysis, now designed to indicate uniqueness; ‘For some p, for every q, 
both the proposition that p is the same as the proposition that q iff Percy says 
that q, and p’. (One might justifiably have some doubts not only about whether 
Percy’s having said just one thing is really a necessary condition of the truth of 
‘What Percy says is true’, but also about the feasibility of the project of in- 
dividuating the things people say; but I shall not linger over such doubts here.) 
Indeed, ‘What Percy says’, Williams thinks, is even less name-like than ‘What 
the postman brought’, since in the analysis ‘p’ is a sentence and not an individual 
variable. 
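
For readers who prefer symbols, the two analyses quoted above may be rendered roughly as follows. This is only a sketch and not Williams’s own notation: S(p) is assumed here to abbreviate ‘Percy says that p’, the quantifiers are taken to bind sentential variables, and ‘=’ stands in for ‘the proposition that p is the same as the proposition that q’.

```latex
% Initial analysis of `What Percy says is true'
\exists p \, \bigl( S(p) \land p \bigr)

% Refined analysis, adding the uniqueness condition
\exists p \, \forall q \, \bigl( ((p = q) \leftrightarrow S(q)) \land p \bigr)
```

On this rendering the final conjunct is the bare sentential variable p, with no truth predicate attached, which is the feature on which the charge of circularity turns.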

However, although he adopts a Russellian account of ‘What Percy says’, 
Williams rejects the Russellian view that ‘What Percy says is true’ entails ‘Percy 
says something’; rather, adapting Geach’s account of sentences the utterance of 
which amounts to the assertion of more than one proposition, or asserting a 
proposition and asking a question, etc., he takes ‘What Percy says is true’ to 
presuppose that Percy says something. Assertion of ‘What Percy says is true’ is 
thus now analysed as the joint assertion of ‘Percy says something’ and ‘For all p, 
if Percy says that p, p’. ‘What Percy says is false’ is not, according to Williams, 
the contradictory of ‘What Percy says is true’; rather, an assertion of ‘What 
Percy says is false’ is the joint assertion that Percy says just one thing, and that 
for some p, both Percy says that p, and it is not the case that p. 

A final chapter investigates whether anything remains of a correspondence 
character. ‘True’, Williams argues, is not really relational, not even in the sense 
that a ‘quantificational’ predicate like ‘married’ is. But the persistent feeling that 
truth is relational is explicable, he suggests, since the favoured analysis is con- 
junctive; the similarities between really relational predicates and functions of 
two arguments may have led to the mistaken belief that truth is relational. 
(I confess that I find this far-fetched.) 

A theory of truth in the style Williams offers has considerable attractions. For 
example, if truth is not a property or a relation, one is no longer obliged to 
explain what it is a property of, or a relation between; and so one is spared the 
aridity of the traditional disputes about truth-bearers. And the simplicity of 
theories in this style is another recommendation. 

However, the theory also involves some problems, and one’s assessment of 
Williams’s book must be guided by one’s judgment as to how much he has contributed 
to their clarification and solution. Williams is aware of the need for an 
adequate account of sentential quantifiers, for example, and in this respect his 
discussion is an advance on that to be found in Mackie [1973]; though he does 
not make it altogether clear that the difficulty for his theory lies in the assumption 
that sentential variables must share the term-like character of the individual 
variables of predicate calculus, and consequently I found his discussion of 
sentential quantifiers less illuminating than Grover’s in [1972]. Some vital 
questions are not discussed at all; chief among these I would count the semantic 
paradoxes, with which any acceptable theory of truth will be obliged to cope. I 
am not suggesting that Williams’s theory is incapable of dealing with the 
paradoxes; indeed, I think it shows promise of improving our understanding of 
them—for instance, by making it explicable how a Liar-type paradox can be 
generated using propositional quantifiers and negation even without the pre- 
dicates ‘true’ and ‘false’. I am suggesting, however, that Williams’s total neglect 
of this issue is a serious omission. 

Again, a reader who, like me, was brought up in the tradition of truth theories 
in Tarski’s style, is apt to feel a certain giddiness when deprived, as in Williams’s 
theory, of the object language/metalanguage distinction; and this raises some 
questions to which I would have liked to have had Williams’s answer: what, on 
his theory, is going on when one gives truth-tables? (one usually thinks of a 
truth-table as indicating what truth-value a compound sentence will have given 
the truth-values of its components, but this explanation seems to depend on the 
idea that truth and falsity are properties); can his theory distinguish between the 
law of excluded middle (‘p v -p’) and the principle of bivalence (T‘p’ v T‘-p’), 
or must it regard this as a distinction without a difference? and must the theory 
be bivalent? (The last question is of special significance for Williams’s version of 
the theory, in view of his adoption of a presuppositional account of descriptions 
such as ‘What Percy says’.) 

A more substantial book, which gave a more comprehensive account of how 
such a theory would deal with the problems I have mentioned, would be most 
welcome. As it is, I think, Grover, Belnap and Camp’s paper [1975] advances 
the debate about the feasibility of a redundancy-style truth theory further than 
the small book Williams has given us. 


SUSAN HAACK 
University of Warwick 


This book is a collection of ten papers, six of which are based on articles published 
between 1967 and 1971 (some in journals philosophers might not frequently 
read). The other four are lectures read in 1969 and 1970 and not previously 
published. The papers seem to be selected for the diverse illumination they give 
to the central question of scientific knowledge of fact. Bunge asserts in the 
introduction that ‘the disjointedness of topics is more apparent than real’ (p. v) 
and that the leit motiv of the book is that the search for understanding of matters 
of fact ‘should be dominated by the scientific method’ (p. v). 

To an extent Bunge is right about the leit motiv, and on the face of it this should 
not disturb philosophers of science. Nonetheless this is a disturbing book. Given 
what I know of the views of science common among scientists and philosophers 
there will be many readers challenged somewhere or other in this book. Mostly I 
found myself cheering Bunge on as he tilted at foes we have in common, using 
mathematical, logical and semantical weapons (tools) with far more skill than I 
could muster and exposing a breadth and depth of knowledge in science that few, 
if any, other philosophers could match. Sword or ploughshare, he is mostly 
successful. I shall mention later where he seemed less successful, stressing, by 
way of constructive criticism of an already very constructive book, some matters 
on which Bunge’s view differs from mine. 

That this constructive book is disturbing may be explained by the extent to 
which Bunge’s leit motiv exceeds the mere advertisement of the fruitfulness of 
scientific method in the search for understanding, including the more general 
philosophical search for knowledge. It is hard to describe the method of science 
without reference to what are misleadingly called the presuppositions of science 
(including ontology) and to criteria for the preference of one hypothesis over 
another—which leads into aesthetics (or more generally axiology) and into 
epistemology. The old fashioned positivists tried it: they dismissed ontology and 
aesthetics, and conflated epistemology and methodology, giving philosophers of 
science the sole task of the logical reconstruction of scientific knowledge. None- 
theless, and despite Wittgenstein’s attempt to reduce the philosophical task, 
where it went beyond logic, to the dissolution of linguistic muddles, the 
positivists had to face several somewhat intractable problems, including: how 
to distinguish science from non-science of various kinds (problem of demarca- 
tion); and how to justify universal hypotheses in science, given reports of 
experience (problem of induction). Assuming the ‘phenomenalism’ of the Vienna 
Circle, these problems are insoluble, as Popper pointed out at the start, and as 
the members of the Circle discovered. (On other assumptions, the problems get 
reformulated, and in some reformulated versions they appear soluble.) Despite 
the failure to solve these problems in their phenomenalist formulation, the early 
proposals for their solution—only those statements are scientific which may be 
confirmed (or refuted) in experience; and laws in science are more or less well 
probabilified by favourable observation reports—are still widely endorsed by 
scientists and philosophers. Bunge constantly returns to the theme of the 
unsuitability of phenomenalism as an interpretative schema for science, especially 
for physics. Thus, I would say that Bunge’s leit motiv is that his version of 
scientific method should dominate the search for truth. This point will be taken 
up as I sketch the contents of the book chapter by chapter, indicating where I 
agree with Bunge and where I have reservations. 


(1) ‘On method in the Philosophy of Science.’ Here is a biting attack upon 
much that passes for philosophy of science in the English-speaking world. In a 
nutshell, Bunge is uncompromising in his demand that for philosophers of 
science to deserve that appellation they should concern themselves with genuine 
philosophical problems which genuinely arise in genuine science. For him, 
philosophy of science must prove its worth by being tested against science, 
rather than as an independent enterprise. Without naming names, Bunge 
strongly attacks (i) what he calls ‘apriorism’—the approach to science which 
decides, in advance of examining actual science, that it is in accord with some 
particular philosophy: ‘it is far easier to discuss philosophical ideas that tradition 
associates with science than to handle philosophically a real piece of science’ 
(p. 2); and (ii) what he calls ‘preface analysis’, e.g. the use of arguments from 
authority based on philosophical remarks of great scientists with which they 
preface their work. Text book analysis—the use of textbooks as reliable sources 
of living science—comes in for faint praise. Bunge warns us that scientists 
usually obscure the methodology of science by the manner of presenting its 
results. This holds not only for textbooks, but even for research reports, it seems. 
What is a poor philosopher to do? Here I think Bunge is somewhat tough on 
philosophers who have not done original work in science. Nonetheless his warning 
is salutary that scientists commonly give a poor account of their own methods. 
The moral however need not be that all philosophers of science should first be 
scientists; it could be to look more closely at what scientists do, rather than to 
join them. Even so, I think Bunge is right to suggest that philosophers of science 
need to be at home in some branch of science, whatever science is, if they are to 
be relevant to it. Bunge gives slight applause to ‘historico-philosophical analysis’ 
as complementary to the systematic philosophical analysis he favours; and slight 
applause to analysis of bits of scientific theories, though he warns that it is easy 
to misinterpret, e.g. the equation linking mass and energy in relativity theory, or 
the uncertainty relations in quantum theory, if they are taken out of the context 
of their theoretical systems. 

The method Bunge favours is systematic analysis and reconstruction of 
‘chunks’ of ‘real’ science, using the tools of logic, mathematics and semantics. 
If this were done, we should be spared ‘non-problems’ such as the grue paradox; 
and the discussion of statistical explanation would be deepened. Incidentally, 
despite Bunge’s attempt to dismiss the grue paradox by talk of the embeddedness 
of scientific predicates in theoretical systems, there is no system I know of which 
unequivocally accounts for the colour of emeralds (it is due to one or other of a 
number of impurities, no doubt). Ironically, some samples of beryllium alumino- 
silicate change from green to blue on being heated, and it does not seem to be 
known in advance which specimens would do so. Thus some beryl is true grue. 
Furthermore, I am by no means convinced by his probabilistic argument that 
the grueness of emeralds has zero probability: it reminds me of Hume’s use of 
induction to dismiss miracles even after he had dismissed induction. 

On the whole, however, I agree with Bunge that there is a distinct difference 
between what he advocates, ‘namely to understand the presuppositions means, 
products, and targets of scientific research in the light of philosophy’ (p. 1), and 
what many philosophers of science are up to. And I prefer what he advocates: 
there seems more point to it. But I think this may be a matter of taste, unless 
Bunge wants to argue that it is misleading to call the products of those who 
succumb to the dangers of ‘amateurism’, ‘fashionableness’, ‘artificiality’, ‘hollow 
exactness’, and ‘scholasticism’ by the label ‘philosophy of science’. Then to an 
extent he may be right. He is certainly right that the kind of activity he advocates 
is of more use to science. 


(2) ‘Testability Today.’ Here Bunge aims to look at the differing criteria of 
demarcation offered by the Vienna Circle and by Popper which are often 
conflated into the formula ‘that alone is scientific which is testable’ and to offer 
an up-dated version of that conflated criterion, in the light of studies of more 
general theories remote from test. Chapter 8, ‘Is scientific metaphysics possible?’ 
should be read in conjunction with chapter 2. In chapter 8, Bunge characterises 
a kind of theory which is most general, systematic, exact (explicitly using logic 
or mathematics), compatible with science, elucidatory of key concepts in 
philosophy or in the foundations of science, and capable of occurring among the 
presuppositions of science. He claims that this kind of theory ought to be called 
metaphysical, gives three examples, and concludes that what he calls scientific 
metaphysics exists. Ergo, the demarcation the positivists drew is false and ‘the 
debate over the exact position and nature of the demarcation line between 
science and metaphysics belongs to the history of philosophy’ (p. 41). Nonetheless, 
Bunge wants to continue to demarcate science from anything else (including 
‘plain metaphysics’) (p. 146) and to do so with an enlarged testability criterion, 
which would allow his kind of metaphysics to count as science, and take into 
account the remoteness from test of most of the best theories in physics. 
Bunge’s analysis (elsewhere as well as in this book) of different kinds of 
theories and of how they meet experience is incomparably good. Chapter 2 is 
supplemented in this regard by chapter 4, ‘The Axiomatic Method in Physics’, 
in which Bunge shows by analysis of axiomatised theories the inadequacies of 
formalism, operationalism and what he calls the ‘double vocabulary’ version of 
empiricism, to account for actual science; chapter 5, ‘Concepts of Model’, in 
which model objects and theoretical models, which have use in science, are 
distinguished from pictorial representation, analogy with familiar object, and 
interpretations of formal systems in the model-theoretic sense, which do not 
play a part in theoretical science; and chapter 6, ‘Analogy, Simulation, Representation’, 
in which Bunge analyses these concepts and contributes greatly to the 
possibility of clearer use of them in future. I have no quarrel with Bunge’s theory 
of theories. In chapter 7, ‘Mathematical Modeling [sic] in Social Science’, Bunge 
advertises the potential fruitfulness of the application of mathematical modelling 
to sociology (the economists hardly need convincing!). Here again the argument 
is clear and the suggestions constructive. I enter a caveat however about the idea 
of natural laws in sociology or the human sciences generally to which I shall return. 
To return to the problem of demarcation, let me say I find Bunge’s discussion 
of the old criteria and of their need for replacement shallow and unconvincing. 
Not that I hold any brief for any of the various criteria proposed by the philo- 
sophers in or associated with the Vienna Circle, including the criteria in which 
they embraced what they fancied was Popper’s falsifiability criterion for dis- 
tinguishing science from pseudo-science. (Popper never intended to distinguish 
meaningful from meaningless statements, nor science from metaphysics). Nor, 
for that matter do I hold any brief for Popper’s criterion if used to distinguish 
science generally from non-science generally (as my criticisms in the Schilpp 
volume show). Indeed, I think the problem of demarcation itself is rather 
shallow, though I think Popper’s attempted solution of it is deep, exciting and 
revolutionary. Despite this I am somewhat distressed that Bunge repeats the old 
errors of supposing that what Popper proposed was merely the logical matter of 
being empirically refutable as ‘the seal of science’. All Bunge’s criticisms of this 
parody of Popper’s criterion were anticipated by Popper in 1935 (and before) 
which is why Popper proposed, in addition, the policy of trying to refute theories 
rather than trying to rescue them from defeat. Furthermore, Popper never 
suggested that theories we had decided to regard as refuted should be rejected 
before something better was on hand to replace them. Nor did he ever suggest, 
as far as I know, that there was a sharp demarcation line between science and 
metaphysics. Hence Bunge’s improvements are improvements merely on the 
older positivist criteria. Popper’s criterion is quite untouched by Bunge’s argu- 
ments. This would not matter very much except that there are grave difficulties 
with Bunge’s ‘enlarged testability criterion’ which Popper’s 1935 criterion does 
not raise, and that Bunge’s criterion was presented deliberately (in a Popper 
symposium, originally) to improve on Popper. I dislike scholasticism as much as 
Bunge does, I fancy, but correct exegesis of a view one ascribes to a particular 
person is at least a matter of courtesy. Beyond the matter of courtesy, it would 
not matter that Bunge replaces merely a view Popper never held, if in the replacement 
he improved upon Popper’s view, as it were by accident. But he does not. 
Bunge’s expanded testability criterion comprises two components: confirmability 
(even if only indirect) ; and compatibility with the bulk of our background scientific 
knowledge. Refutability is relegated to the position of being necessary merely for 
‘optimal empirical testability’ (p. 31). First let me say why I think the problem of 
demarcation is shallow, and then why I think Bunge’s solution is less acceptable 
than Popper’s even though Popper’s will not do. 

What is the point in trying to solve the problem of demarcation? I cannot take 
seriously the idea of being interested in the ‘real’ meaning of the word ‘science’, 
since I adopt Humpty Dumpty’s view of the meanings of words plus the caveat 
that rational discussion will be at an end unless we pay attention to the expectations 
hearers will have as to our intended meaning when we use a word. Hence for 
rational discussion we need to pay attention to the going convention regarding a 
word’s meaning. I do take seriously the problem of characterising methods 
thought good for this or that purpose—in the case of science and philosophy 
methods thought good for the pursuit of truth. And I do take seriously the 
complementary task of pointing out deficiencies in methods mistakenly thought 
by some to be conducive to their ends. But I find it hard to believe that there will 
be one and only one proper comprehensive method for the pursuit of truth such 
that we could lay down severally necessary and jointly sufficient conditions (the 
logicians’ pet demand) for the pursuit of truth, and tick off whether this or that 
person, group, institution, or whatever, was engaged at this or that time in that 
pursuit. My guess is that there may be a number of features generally held in 
common. 

It seems to me that there are two common ulterior motives in seeking a 
demarcation. The first is in order to refuse the honorific accolade of being 
scientific to enterprises we dislike, despise, think mischievous etc. The second is 
in order to refuse research grants to enterprises we think a waste of time. Both 
these are political matters rather than philosophical. The most philosophical 
juice I can squeeze out of the problem of demarcation relates to the task of 
characterising methods conducive to the search for truth. 

Popper’s solution to the arid problem is a brilliant contribution to the juicy 
task of saying how to pursue truth: propose bold conjectures to solve the problems 
posed by experience or by previous conjectures or by both together; criticise 
these conjectures as rigorously as you can, not aiming to hang on to any in the 
teeth of criticism, but aiming to improve them. It needs to be stressed that the 
criticism required is a matter of social tradition which can make room for pig- 
headed individuals as well as cranks or crackpots. By contrast with this intellectu- 
ally exciting and socially hospitable recipe, Bunge’s prescriptions are dull and 
inhospitable. Furthermore Bunge’s criterion for a piece of work or an item of 
proposed knowledge to be scientific is conservative almost to the point of 
circularity: that only can be reckoned as science which is compatible with what 
is already reckoned as science. Two problems: (i) suppose we could agree on 
what is already science and on what would count as something’s being compatible 
with that (a very tall order), would not our errors in science be protected from 
elimination by audacious new ideas? (ii) who is to be taken as authoritative in 
deciding what to reckon as already science and what to count as compatible? 
This last question can hardly be answered with ‘Those already reckoned as 
scientists’. ‘Who is already a scientist?’ is at issue in ‘What is science?’ 


(3) ‘Is Biology Methodologically Unique?’ Here Bunge considers ten argu- 
ments that go to suggest, not that biology studies different kinds of physical 
systems by different techniques—that is taken for granted—but that the general 
method of biology differs from that of physics and chemistry. Bunge pejoratively 
calls the genus of the opinions he here criticises ‘methodological vitalism’ 
although only two of the arguments he considers even look like vitalistic argu- 
ments: the first, that biology asks teleonomic questions; and the eighth that 
there is a holism appropriate in biology which is inappropriate anywhere else. 
Indeed several of the arguments seemed more phenomenalistic than vitalistic: 
the fifth, that laws in biology are statistical rather than dynamic (surely vitalism 
presupposes dynamism); and the tenth, that biology need not invoke 
unobservables. 

In my view, Bunge was busy attacking all and any arguments he had heard of 
which might help protect vitalism even of a methodological variety, and in 
general J think his refutations of the claims to methodological uniqueness were 
successful. However, presenting his arguments as he did, as an attack on vitalism, 
confused matters, since his rebuttals contained even more devastating attacks 
upon mechanism of the old inertial kind, and upon reductionism inertial or
dynamic; and since the point about many of the arguments he rebutted was that 
they were designed to protect biology from mechanism and reductionism. 
Bunge’s message, especially if this chapter is read in conjunction with chapter 9 
‘The Metaphysics, Epistemology, and Methodology of Levels’ and chapter 10,
‘How Do Realism, Materialism and Dialectics fare in Contemporary Science?’,
is that reductionism is false and that biologists have nothing to fear from a 
materialism that acknowledges the partial autonomy of different levels of being 
and of analysis. I am not quite convinced of the truth of this latter claim and I 
shall return to that point below, where I shall argue that the concept of initiative 
in human affairs is rendered vacuous by Bunge’s strict materialism. 

As I have indicated already, I have no quarrel with chapters 4, 5 and 6 where 
I think Bunge is at his best, and only one caveat regarding chapter 7, which is of 
the same kind as my reservation just stated: I think the social sciences need to 
make more room for the concept of the morally autonomous human being. The 
very argument I shall use against Bunge on this point illustrates the only criticism
I have of chapter 8 (which otherwise does a splendid job of demolishing the old 
positivist criterion of verifiability) namely that contrary to Bunge’s view, plain 
metaphysics, which conducts its conversations without the aid of formal logic or 
semantics (though not, one would hope, without logic in the broader sense) is
relevant to science. Let me put it this way: for metaphysics to be intellectually
respectable (as far as I judge) and for it in addition to be beneficial in its influence
on rational inquiry after truth (including in its influence on science) it is not
mandatory for it to employ either formal logic or formal semantics. While care
is to be exercised in the choice of words and the construction of sentences (of
course), debate in metaphysics, or more broadly in philosophy, may usefully
proceed in English, or French, or German. 

For my money, chapters 9 and 10, which deal with the deeper ontology Bunge
favours, are the most exciting chapters in the book. Bunge is one of the very few
philosophers of science who try to come to grips with ontological issues. 

The pall of the positivist rejection of metaphysics still hangs heavily over much 
philosophy of science. Which is to say, ontological commitments sneak into 
philosophical work unacknowledged, unchecked, uncriticised and (hence) very 
likely immune from improvement. In some circles, it would be heretical even to 
say, as I have just done, that they existed. Incidentally, talk about ‘the pre- 
suppositions of science’ promises ambiguity and confusion since not science but 
the individual scientist has presuppositions, and these vary from person to 
person. Ditto, the goals of research. True, we can speak of the presuppositions of 
a theory, meaning those assumptions without which the theory does not make 
sense: but one reason for ontological disputes in philosophy of science is that, 
apparently, there are not any necessary presuppositions for science, in this
narrow sense of ‘presupposition’, though there may be many presuppositions in
a broader sense, which scientists have in common. People seem to be able to 
contribute to the growth of knowledge in science from the most diverse sets of 
assumptions. Hence I prefer to discuss metaphysics in its own right and in- 
vestigate independently whether this or that metaphysical presupposition might 
tend to inhibit or to encourage scientific inquiry as well as whether or not it is 
true. Certainly I do not wish to settle ontological disputes by appeal to the 
explicit predilections of eminent scientists or of the majority of scientists. Let
me now try to set out Bunge’s ontology briefly, before taking up the single point 
of disagreement. 

First realism: there are things in themselves which are knowable by approxima- 
tion through theorising and experiment, all such knowledge being hypothetical, 
hence corrigible, indirect and symbolic. With this thesis I am in complete 
agreement. 

Secondly, pluralism of levels: reality is a level structure, with every existent 
belonging to at least one level, and every level having some autonomy and 
stability, and its own peculiar properties and laws. Hence, over time some levels 
are ‘emergent’ with respect to others, and some things (systems) ‘evolve’ from 
others. Every event is primarily determined in accordance with the laws of its 
own and of contiguous levels. In short, qualitative variety is irreducible. Again, 
complete agreement. 

Thirdly, dynamism (here I make use of other writings of Bunge, especially his 
discussions of causality and of the propensity interpretation of probability, to set 
the view out more fully): every existent is in the process of change through 
dynamic (forceful, productive) interaction either internally or externally. Some 
changes involve the ‘emergence’ of new qualities and new laws. Again, complete 
agreement. 

Fourthly, materialism: every existent is a material system, the outcome of 
lawful change (self-assembly or decomposition) in other material systems. 
Organisms are material systems; the mind is an activity of the central nervous
system, society is a system of organisms. Here there is only qualified agreement. 

I am afraid that in taking up ontological questions in connection with science, 
and in agreeing on realism, pluralism and dynamism, Bunge and I are so far
removed from what is commonly talked about in philosophy of science that any
dispute over materialism may appear a mere family quarrel. But it is more than
this because, ironically, the point at which I disagree with Bunge is precisely the
point, in all this ontology, where he would have most scientists and philosophers
of science on his side.

Let me put my difference with Bunge this way: Kant distinguished a
noumenal from a phenomenal domain; and, while he restricted scientific inquiry
to the phenomenal domain, so that the objects of that inquiry were never to be
regarded as things in themselves, he located the morally autonomous, rational 
self in the noumenal domain. Given this distinction of domains, philosophers 
may forever protect their belief in the self from any challenge posed by deter- 
minism assumed in the name of science. Now both Bunge and I reject Kant’s 
restriction of science to the phenomenal world: we both believe that scientific 
knowledge is conjecture aimed at characterising the real (noumenal) world. (Of 
course we agree with Kant that human beings do not directly, immediately 
perceive things in themselves: scientific knowledge is intended to construct both 
a theory of what there is and a theory of perception to explain why we perceive 
what we do.) Given this rejection, the protection once offered to the autonomous 
self is removed. I think we both agree that the autonomy of the self is severely 
hampered by the impact of internal and external physical constraints on the 
functioning of the brain—one does not have to be an identity theorist to accept 
those constraints as real. The question that remains is whether a thoroughgoing
materialism, which allows that the propriety of proposing natural laws is not 
restricted to the physical sciences but ranges also over the biological, psycho- 
logical and social sciences, is inimical to a theory of moral autonomy. Please note 
this is not the same as the question whether ontic determinism is inimical to 
moral autonomy, since true natural laws may be probabilistic. (Here I am taking 
Bunge’s and my agreement about level pluralism and about propensity inter- 
pretation of probability for granted). Please note also, that the problem does not 
arise if we hold merely a phenomenalist or instrumentalist interpretation of 
scientific theories, since under that interpretation theories are not either true or 
false, they are merely more or less felicitous abbreviations of accumulated data. 
(Do not ask what is the epistemological status of a datum!) The problem arises 
only for realists, in the sense indicated above. 

In a nutshell, the problem is whether room is left for human initiative in 
personal, social or political affairs, by a theory that accords to laws in the human 
sciences the same status as laws in the physical sciences. I shall here offer one 
argument to suggest that no such room is left, which is thus an argument for 
rejecting Bunge’s materialism. The view with which I wish to replace Bunge’s 
materialism is that in those sciences which span human affairs, the laws proposed 
can never have a stronger status than that of trend forecasts that presuppose no 
human initiative in social or political affairs. I argue that provided human beings 
were creatures of instinct and habit; provided that ideas (especially revolutionary 
ideas) played no role in choice of action sufficient to bring about deliberate trend- 
modifying actions; provided that there were no amplification effects subsequent 
upon human initiative (remember the humorous poem: ‘For want of a nail... 
the kingdom was lost’!), then laws describing trends in human behaviour (laws 
in economics, sociology, history) might properly be regarded as natural laws. 
Otherwise, we shall have to say that whatever the extent to which we expect that 
laws announced in the human sciences as accurately fitting all known facts in past
experience will hold accurately in the future, to that extent we rule out human
initiative based on the moral autonomy of the self. I do not mind ruling it out to
some extent, because I believe human beings are to some extent creatures of
habit etc. I do however object to its being ruled out completely. Hence I regard
all laws in the social sciences (Marx’s theories of the direction of history, Western
economists’ rival theories of political economy, the assumptions leading to the
Club of Rome’s doomsday findings) as warnings of what may befall unless human
beings exercise initiative based on moral autonomy to modify or even reverse the 
announced trends. 

To return specifically to my reservations regarding chapters 3 and 7. I welcome 
any effort to make use, in biology and sociology, of elements of method found 
fruitful in physics and chemistry. However, while I do not wish to accord the
status of having initiative, rooted in moral autonomy, to the objects of inquiry 
in physics and chemistry, nor to just any of the objects of inquiry of biology 
(animal lover though I am, I do not wish to assert, though I should not rule out, 
for other species, some parallel quality to human moral autonomy); I do wish to 
insist upon the moral autonomy of humans as a datum which undermines the 
natural law status of deterministic and stochastic (probabilistic) laws in biology 
and sociology in so far as those laws are pertinent to the prediction of human 
actions. Hence the presupposition of moral autonomy implies an element of 
difference between the methodology of the human sciences and the physical 
sciences which I should not want Bunge to suppress in his (correct) emphasis 
upon similarities. 

Finally, a comment on Bunge’s style: Bunge is so much opposed to 
scholasticism that he rarely takes time in his writings to indulge in careful 
exposition and detailed exegesis of the views he wishes to attack and to replace. 
This may be thought a pity, since his own views are worth so much attention yet, 
assuming some common human weaknesses and impatience similar to his own, 
his readers may give him tit-for-tat and more or less ignore his views. Rational 
discussion would really be at an end if we no longer had the time or patience to 
seek to understand each other thoroughly before moving on to stating our own
views. For example, although I hold no brief for dialectics at all, I cannot help 
suspecting that those who do would be left unmoved by Bunge’s demolition of 
their position (as summarised by himself) in chapter 10. By no means do I intend 
to imply that Bunge does not understand the views he criticises or that he 
commonly distorts them in précis (though I do think he occasionally errs); nor 
do I mean to imply that he ought to alter his style: there is surely room for a 
few creative thinkers in a hurry. Rather I make the point as a warning to readers 
more accustomed to leisurely argument: to benefit from the rich suggestiveness 
of this book, avoid being irritated by the impatience and briskness of its style! 


TOM SETTLE 
University of Guelph 
Boolos, G. S. and Jeffrey, R. C. [1974]: Computability and Logic. Cambridge
University Press. £4. Pp. X+262.
This book is a text in mathematical logic at the advanced undergraduate or first
year graduate level. It covers a good number of topics, including the complete-
ness of first-order logic, the Löwenheim-Skolem theorem, recursive and Turing
computable functions, the undecidability of first-order logic, and Gödel’s first
incompleteness theorem. In addition, there is a welcome account of some results
which are not usually found in logic textbooks, e.g. Löb’s theorem on provability
predicates (and its application to the unprovability of consistency), second-order 
logic, non-standard models of arithmetic, arithmetic forcing, and the decid- 
ability of additive arithmetic. The authors’ style is pleasantly informal and both 
proofs and explanations are very clear. All in all, the book provides an ex-
cellent second course in mathematical logic, and can be warmly recommended to
students. 
JOHN BELL
London School of Economics 
Logicism Revisited*
by ALAN MUSGRAVE


1 Introduction
2 Old-style Logicism and its Breakdown
3 Is Set Theory a Branch of Logic?
4 Logicism Rescued by the If-thenist Manoeuvre
5 If-thenism


1 INTRODUCTION

I was led to revisit logicism by an historical riddle, so I will begin with that. 
Mathematics always played a key role in the philosophical battle between 
empiricists and rationalists or intellectualists. The empiricists always had 
trouble with mathematics: some (like Locke) said it consisted of ‘trifling’
or ‘verbal’ propositions, others (like Mill) said it consisted of empirical 
truths (Hume vacillated between these two as regards geometry). Neither 
account seemed plausible. The intellectualists, on the other hand, derived 
their chief comfort and inspiration from mathematics. Anyone who denied 
that a priori reasoning could issue in genuine knowledge was met with 
the triumphant question ‘What about Euclid’s geometry?’. Russell
describes the situation well (in his [1897], p. 1): 

Geometry, throughout the 17th and 18th centuries, remained, in the war against
empiricism, an impregnable fortress of the idealists. Those who held—as was 
generally held on the Continent—that certain knowledge, independent of 
experience, was possible about the real world, had only to point to Geometry: 
none but a madman, they said, would throw doubt on its validity, and none but 
a fool would deny its objective reference. The English Empiricists, in this matter, 
had, therefore, a somewhat difficult task; either they had to ignore the problem, 
or if, like Hume and Mill, they ventured on the assault, they were driven into 
the apparently paradoxical assertion that Geometry, at bottom, had no certainty 
of a different kind from that of Mechanics... 


Now the great achievement of modern empiricism, we are often told, is 
to have removed this old objection to empiricism. Modern empiricists 
have shown, it is said, that Locke was basically right: despite appearances, 
mathematics does consist of ‘trifling propositions’, or more precisely, of 


* An earlier version of this paper was read at a meeting of the British Society for the 
Philosophy of Science on 13 October 1975. I have benefited from comments and critic- 
isms, made at that meeting and elsewhere, by Peter Clark, Max Cresswell, Donald 
Gillies, Moshe Machover, Graham Oddie, Sir Karl Popper, Robert Stoothoff, Pavel 
Tichý, and John Worrall. 
tautologies. The truths of mathematics are all logical truths. And the
a priori status of these truths is no threat to empiricism because they are
as empty of content as ‘Either snow is white or it isn’t’. The desperate
solution of some early empiricists has now been shown to be the correct
one by the detailed reduction of mathematics to logic. Such is the philo-
sophical importance of the logicist programme. No wonder our modern
empiricists call themselves ‘logical empiricists’ to distinguish themselves 
from their forbears who could not take philosophical advantage of the 
reduction of mathematics to logic. 

Yet something is wrong with this empiricist success story. It is well-
known that the programme of reducing mathematics to logic could not
be carried through, that it foundered on the logical paradoxes. And yet, 
decades later, the logical empiricists made the logicist thesis a cornerstone 
of their position. It looks like a prime example of Georg Cantor’s Law 
of the Conservation of Error: a thesis continues to lead a healthy life long 
after the programme in which it was embodied has passed away. And yet 
logical empiricists did not ignore the difficulties which beset the logicist
programme—indeed, they were better aware of them than most. This, 
then, is my historical riddle. 

My solution to the riddle is that the logicist thesis which survived into 
logical empiricism is a very different thesis from the original one. I will 
call this new thesis If-thenism, to distinguish it from old-style logicism or
logicism proper.1 In this paper I will show how If-thenism rose from the
ashes of old-style logicism, explain the difference between them, and 
ask whether If-thenism is an adequate philosophy of mathematics. 


2 OLD-STYLE LOGICISM AND ITS BREAKDOWN 


Old-style logicism was an incredibly bold thesis. Russell stated it as 
follows, in his ‘Mathematics and the Metaphysicians’ written in 1901 
(Russell [1917], pp. 75-6): 


It is common to start any branch of mathematics—for instance, Geometry—with 
a certain number of primitive ideas, supposed incapable of definition, and a 
certain number of primitive propositions or axioms, supposed incapable of proof. 
Now the fact is that, though there are indefinables and indemonstrables in every 
branch of applied mathematics, there are none in pure mathematics except such as 
belong to general logic... All pure mathematics—Arithmetic, Analysis, and 
Geometry—is built up by combinations of the primitive ideas of logic, and its 
propositions are deduced from the general axioms of logic... And this is no
longer a dream or an aspiration. On the contrary, over the greater and more 
difficult part of the domain of mathematics, it has been already accomplished; 
in the few remaining cases, there is no special difficulty, and it is now being 


1 The term was coined by Putnam who, as we will see, defends a version of the doctrine. 
rapidly achieved. Philosophers have disputed for ages whether such deduction
was possible; mathematicians have sat down and made the deduction. For the 
philosophers there is now nothing left but graceful acknowledgements. 


We can divide the thesis so confidently expounded here into three parts: 
(A) all so-called primitive notions of mathematics can be defined using
only logical notions; 
(B) all so-called primitive propositions of mathematics can be deduced 
from logical axioms; 
(C) all theorems of mathematics can be deduced from its so-called primi- 
tive propositions (and hence, by virtue of (B), from logical axioms). 
The logicist programme was the monumental effort of Frege, and then of
Russell and Whitehead, to give a detailed demonstration of these three
claims.
Of course, some of the groundwork for this ambitious programme had 
already been done. Analysis had been ‘arithmetized’, various algebraic 
theories had been axiomatised, the axiomatisations of geometrical systems 
had been improved, Cantor and Dedekind had developed set theory, and
Dedekind and Peano had axiomatised arithmetic. But in none of this earlier
work was the logic involved made fully explicit. The early logicists
proposed to remedy this defect, and show that mathematical proofs could 
be formalised (hence their thesis (C)). And they also proposed to define 
Peano’s primitive arithmetical notions in logical terms (thesis (A)), and to 
deduce Peano’s arithmetical axioms from logical axioms (thesis (B)). The task 
remained a monumental one, despite the work of their predecessors. 
As is well-known, the logicist programme, and thesis (B) in particular, 
foundered upon the logical paradoxes. It turned out that one of the 
necessary axioms of logic, far from being a trivial logical truth, was logically 
false. The axiom in question, the (unrestricted) Axiom of Set Abstraction,
states that there exists, for any property we describe via an open formula, 
a set of things which possess the property.1 From this Axiom we can 
easily derive Russell’s Paradox.2 Hence something was wrong with the 
proposed logicist foundation for mathematics, and it had to be revised. 
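
The derivation alluded to here can be set out in a few lines. What follows is only a sketch in modern notation, not the author's own formulation: the schematic letter \varphi, the name R for the offending set, and the use of the membership sign are assumptions introduced for illustration.

```latex
% Unrestricted Axiom (schema) of Set Abstraction: for any open formula
% \varphi(x) in which y does not occur free,
\[
  \exists y\,\forall x\,\bigl(x \in y \leftrightarrow \varphi(x)\bigr).
\]
% Take \varphi(x) to be the open formula x \notin x.
% The schema then yields a set, call it R (a label assumed here), with
\[
  \forall x\,\bigl(x \in R \leftrightarrow x \notin x\bigr).
\]
% Instantiating the universal quantifier with R itself gives
\[
  R \in R \leftrightarrow R \notin R,
\]
% which is contradictory in classical logic: this is Russell's Paradox.
```

Only one instance of the schema and one step of universal instantiation are used, which is why the axiom counts as logically false rather than merely doubtful.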
Frege responded by amending his Basic Law (V), his version of the 
1 Frege had expressed doubts about the ‘self-evidence’ of his version of this axiom: “A 
dispute can only arise, as far as I can see, with regard to my Basic Law concerning 
courses-of-values (V), which logicians perhaps have not yet expressly enunciated, and
yet is what people have in mind, for example where they speak of the extensions of 
concepts. I hold that it is a law of pure logic. In any event the place is pointed out where 
the decision must be made” (Frege [1964], pp. 3-4). 

2 Russell’s system is, of course, also subject to a version of his paradox which involves
only predicates. This version of the paradox shows the untenability of unrestricted
quantification over predicate variables, just as the set-theoretical version shows the
untenability of unrestricted or naive set theory.
Axiom of Abstraction, to try to avoid Russell’s paradox. But the amended
law was also contradictory. Frege probably discovered this himself, and
as a result was finally led to abandon the logicist programme.1

Russell’s solution to the problem was his famous Theory of Types. 
The unrestricted Axiom of Abstraction was renounced, thus avoiding
Russell’s Paradox and (hopefully) any other paradoxes. Unfortunately 
Russell’s new logic, as well as preventing the deduction of paradoxes, also 
prevented the deduction of mathematics. Russell therefore supplemented 
it with some additional axioms, the Axioms of Infinity, Choice, and 
Reducibility, and he and Whitehead proceeded to show that the whole of 
classical mathematics could be obtained from the Theory of Types together
with these additional axioms (here, of course, I ignore Gödelian compli-
cations). Showing this was, of course, a great achievement and one which,
as Russell might say, the philosophers can but gracefully acknowledge. 

Zermelo’s solution to the problem was structurally similar. He too 
renounced the unrestricted Axiom of Set Abstraction, and proposed less
powerful axioms for set theory. The hope was that these new axioms
would be powerful enough to yield mathematics, but not so powerful as 
to yield contradictions. Zermelo did not know whether his second hope 
had been fulfilled (and Gödel later showed that in a sense we cannot know 
this). But it was different with the first hope. It did turn out that the whole 
of classical mathematics could be reduced to Zermelo’s set theory: all 
mathematical notions were defined in terms of logical notions together 
with ‘∈’, the single primitive notion of set theory; and all true propositions
of classical mathematics were derived from logical axioms together with
the axioms of set theory (ignoring Gödelian complications once more).
Again, a great achievement which the philosophers can but gracefully 
acknowledge. 

Philosophers might well ask, however, what has become of the major 
philosophical claim of the early logicists. Does the reduction of mathe- 
matics to set theory (or to a theory like Russell’s) establish that mathematics 
is a branch of logic? Clearly this will depend upon whether we count set 
theory (or Russell’s theory) a branch of logic. I now turn to this question. 


3 IS SET THEORY A BRANCH OF LOGIC? 

The question sounds dangerously verbal. One might merely stipulate that
the term ‘logic’ is to cover set theory, and then pronounce the logicist
thesis true. But this is to make logicism true by arbitrary stipulation, a
method which (as Russell might remark) has all the advantages of theft over
honest toil. If the assimilation of set theory to logic is to be more than
1 On the failure of ‘Frege’s way out’ see Quine [1955]. 
verbal, it must involve showing that (a) the primitive notion of set theory,
‘∈’, is a logical notion, and that (b) the axioms of set theory are logical
truths. (Similar things would have to be shown for Russell’s theory.) 
Can either of these things be shown?

It is hard even to discuss the first question, whether ‘∈’ is a logical
notion, for the simple reason that we lack a convincing account of what it 
takes for a notion to be a logical one. Tradition has sanctified a few notions 
as logical ones: the usual connectives, the quantifiers, the ‘is’ of predication, 
and (though some dispute its inclusion) the ‘is’ of identity. These are the 
notions that figure essentially in the usual rules of inference—and fixed
meanings are assigned to them by the usual semantical rules. But what is
the rationale behind this traditional list? Why do logicians count ‘is’
logical and ‘eats’ non-logical?

Bolzano, that great pioneer of the foundations of logic, despaired of an
answer and relativised his definitions of logical truth, logical consequence, 
etc., to an arbitrary selection of terms to be counted logical. Now any true 
statement comes out logically true if we count all its terms logical and so 
are not allowed to vary the interpretation of any of them. And a hallowed
syllogism such as Barbara will come out invalid if we count ‘are’ non-
logical and interpret it to mean, say, ‘eat’. Bolzano found this quite
acceptable: he could see no way of establishing that “All men are mortal”
is not really a logical truth, or that Barbara is really valid.1 Tarski drew
attention to the problem in 1935, and concluded “no objective grounds 
are known to me which permit us to draw a sharp boundary between the 
two groups of terms” (Tarski [1956], pp. 418-19). Popper tried hard to 
solve the problem in the 1940s (see Popper [1947]), but has recently 
admitted that his solution does not work (see Popper [1974], p. 1096). 
Kemeny defines the logical notions as those whose meaning is fixed by 
the customary semantical rules (Kemeny [1956], part 1, p. 17), but admits 
that the rules are drawn up with a specific list of logical notions in mind 
and hence cannot provide a rationale for that list. 

In this situation the prospects of settling our first question in a non- 
arbitrary way do not seem bright. There are, however, two arguments 
on the question, one for classing ‘∈’ logical and one against. But neither
of them is very conclusive. 

The early logicists did not hesitate to class ‘∈’ logical. And we can
reconstruct the following argument for doing so. Since the Axiom of Set 
Abstraction is our sole existential axiom for sets, each of our sets is 
determined by an open formula. So we can eliminate ‘∈’ wherever it occurs
in favour of admittedly logical notions contained in the open formula. To 


1 On Bolzano see Kneale and Kneale [1962], pp. 365-71. 
put it crudely, ‘∈’ is merely an alternative notation for admittedly logical
notions, principally the ‘is’ of predication. Or as Frege put it (Frege
[1972], p. 32):

I have replaced the expression ‘class’ [or, we might add, ‘member of a class’] 
which is often used by mathematicians, by the expression ‘concept’ [or, we might
add, ‘falling under a concept’] which is customary in logic; and this is not merely 
an indifferent change of nomenclature, but is important for the knowledge of the 
true state of affairs.


The argument is cogent enough, but it rests on the mistaken assumption 
that the Axiom of Abstraction is true. The discovery of the paradoxes 
undermined this argument. And in axiomatic set theory ‘∈’ is counted a
non-logical or primitive mathematical notion. 

It remains the case, however, that in Russell’s Theory of Types ‘∈’ is
an explicitly defined notion (on each type level). If we count all notions 
in the definiens, and in particular the higher-order quantifiers, as logical, 
then presumably ‘∈’ is to be counted logical also. Carnap and Hempel
do so, and triumphantly conclude that all mathematical notions are logical
ones. Quine demurs, arguing that the “tendency to see set theory as logic 
has depended early and late on overestimating the kinship between 
membership and predication”. Predication is one thing, says Quine, but 
once we existentially quantify a predicate variable we assert the existence 
of an attribute, and via that attribute, of a set. He concludes that so-called 
higher-order predicate calculus is actually a “way of presenting set theory 
[which] gives it a deceptive resemblance to logic” (Quine [1970], pp. 66-8). 
Presumably, for Quine, logic stops at first-order logic, and the higher-
order quantifiers and ‘∈’ are not to be counted logical notions. Is this a
mere prejudice in favour of first-order logic? This brings me to the
argument against classing ‘∈’ logical.

The argument rests on Gödel’s results that first-order logic is complete
while second-order logic is not. If we stick to the traditional list of logical 
notions, and define the notion of logical consequence accordingly, then 
all the logical consequences of a set of premises can be captured by syn- 
tactic methods. If, on the other hand, we extend the list of logical notions 
to include higher-order quantifiers, then the logical consequences of a 
set of premises can no longer be captured syntactically.2 Therefore, the 
argument runs, we should refuse to extend the title ‘logic’ to so-called 
higher-order logics. The argument contains a tacit and unargued assump- 
1 See Carnap [1942], section 13, pp. 57-8; Hempel [1945], p. 375.

2 For an informal account of these results, see Henkin [1967]. I here ignore the so-called
‘completeness theorem’ for higher-order logics: this arises from the attempt to give a
semantic characterisation of the syntactically provable sentences, and involves a dis-
tortion of the notion of logical consequence.
tion: that logic must be confined to what can be captured syntactically.
Hence it is not a conclusive argument.

So we see that it is not easy to settle the question of whether ‘∈’ is a
logical notion. Perhaps we should accept the view, once expressed by
Tarski, that this is a matter of taste: if you prefer to work in the Theory
of Types you will count ‘∈’ logical, if you prefer to work in set theory you
will not.1 I do not think that it really matters if we reach this rather sad 
conclusion, because it is our second question, whether the axioms of set 
theory are logical truths, which is the really crucial one. 

As will be apparent from this, I think our two questions are independent 
of each other, so that we could count ‘∈’ logical without counting the
axioms of set theory logical truths. Many philosophers would disagree. 
Hempel claims that because the Axiom of Infinity “is capable of expression 
in purely logical terms [it] may be treated as an additional postulate 
of logic” (Hempel [1945], p. 377). Kemeny claims that most logicians 
recognise the Axioms of Infinity and Choice “as legitimate logical prin- 
ciples” presumably for the same reason (Kemeny [1959], p. 21). This is a 
position which goes back to Wittgenstein’s Tractatus, and to Ramsey.

But there is a simple argument against it, which goes back to Russell. 
First, a logical truth may contain non-logical notions (consider “All men 
are men”), so that containing only logical notions is not a necessary 
condition for being a logical truth. Second, and more controversially, it 
is not a sufficient condition either: for we can express, using only admittedly 
logical notions, each of the mutually incompatible claims ‘“There is exactly 
one thing”, “There are exactly two things”, “There are exactly three 
things”, and so on; can it plausibly be maintained that one of these is 
logically true, and the rest logically false?2

1 Tarski expressed this view in a lecture ‘What are logical notions?’ delivered in London
on 16 May 1966. The basic idea of that lecture was that the logical notions are those
which are invariant under every one-one transformation of the ‘universe of discourse’
onto itself (which goes back to a paper of Lindenbaum and Tarski of 1935: see Tarski
[1956], chapter XII). But this idea cannot settle the question of whether ‘∈’ is a logical
notion.


2 To say, for example, that there are exactly two things, we can write:
(∃x)(∃y)(x ≠ y & (z)(z = x ∨ z = y)).

Wittgenstein regarded any such proposition, Russell’s Axiom of Infinity included,
as a nonsensical pseudo-proposition which was trying to say what could only be shown:
see his [1922], 4.1272 (also 2.022-2.023 and 5.534-5.535). Ramsey thought such propositions,
including Russell’s Axioms of Infinity and Choice (though not the Axiom of
Reducibility), were either tautologous or contradictory, though the human mind may never
be able to discern which: see his [1931], pp. 57-61.

The rationale of Ramsey’s view appears to be this. The truth or falsehood of existential 
claims like this hinges on the cardinality of the domain of interpretation. By varying the 
cardinality of the domain, we can make any such statement come out true in one 
interpretation and false in another. But suppose we insist that part of the definition 
of a language is the domain over which the quantifiers are to range, so that all interpretations
of any sentence of that language must have the same domain. Then any existential
claim of our language will be either true in all interpretations, hence logically true, or
false in all interpretations, hence logically false.

But this is to make the notion of logical truth relative to language in an extreme fashion
(though if you operate mistakenly with only one language, as Wittgenstein did in the
Tractatus, the relativity is not apparent). On this view there are infinitely many first-order
languages (one whose quantifiers range over one-element domains, a second whose
quantifiers range over two-element domains, and so on). And as well as the formulas
which are logical truths in all of these, there is an infinite sequence of formulas each
of which comes out logically true in exactly one language and logically false in all others.
It seems to me that we should avoid definitions of ‘language’ (hence of ‘interpretation’
and of ‘logical truth’) which have such odd results.


106 Alan Musgrave


Russell did not think so, and for a simple reason.1 Each of these statements
makes a specific existential claim which is false in some possible
worlds, whereas “Pure logic...aims at being true...in all possible
worlds, not only in this higgledy-piggledy job-lot of a world in which
chance has imprisoned us” (Russell [1919], p. 192). It was for precisely
this reason that Russell refused to count his Axioms of Infinity, Choice,
and Reducibility as logical truths. When he wrote the Principles of
Mathematics Russell still hoped that the Axiom of Infinity might be proved
from logic. But he came to regard it “as an example of a proposition which,
though it can be enunciated in logical terms, cannot be asserted by logic
to be true”.2 In Principia Mathematica the Axioms of Infinity, Choice,
and Reducibility were said not to be logically necessary propositions, but
rather propositions which “can only be legitimately believed or disbelieved
on empirical grounds”.3

This verdict of Russell’s is preserved when we transform his rather
vague Leibnizian talk about ‘truth in all possible worlds’ into our more
precise semantical definition of logical truth. That definition states, roughly,
that a statement is logically true if it comes out true in all interpretations in
all (non-empty) domains.4 Now everyone agrees that Russell’s problematic
axioms, or the axioms of set theory, are not logical truths in this sense of
the term.

1 See Russell [1903], Introduction to the second edition, p. viii (Russell tells us on p. v
that most of the book was written in 1900). On the inadequacy of proposed ‘proofs’ of
the Axiom, see Russell [1919], chapter XIII.

2 Russell [1919], pp. 202-3. The same applied to the Axioms of Choice and Reducibility
(Russell [1919], pp. 117, 191), and indeed, to all ‘existence theorems’ (Russell [1903],
Introduction to the second edition, p. viii). Russell even came to regard it as “a defect
in logical purity” that his logical axioms implied the existence of at least one thing
(Russell [1919], p. 203, footnote); for this anomaly, see the next footnote but one.

3 Russell and Whitehead [1910-13], volume II, p. 183 (on the Axiom of Infinity); see
also volume I, p. 62 (on the Axiom of Reducibility), and volume I, p. 504 (on the Axiom
of Choice). The ‘empirical’ or ‘inductive’ grounds for believing these axioms included
the fact that true mathematical statements could be derived from them. Originally
mathematics was to be saved from scepticism by being derived from trivially true logic;
now ‘logic’ is to be saved from scepticism by having trivially true mathematics derived
from it (see Lakatos [1962], pp. 174-8).

4 The restriction of interpretations to non-empty domains is the source of the minimal
existential logical truth ‘There is at least one thing’, since it renders valid the argument
from the logical truth ‘(x)(Fx ∨ ¬Fx)’ to ‘(∃x)(Fx ∨ ¬Fx)’. Removing this ‘defect in
logical purity’ leads to the so-called ‘free logics’.

For example, when we prove the independence of the usual
axioms for set theory, we find for each of them an interpretation in a non-empty
domain in which it comes out false while the rest come out true.1
This is why these axioms are classed as ‘proper’ or ‘mathematical’ axioms,
and not as logical ones.
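
The semantical test just described can be tried out mechanically on small cases. What follows is only an illustrative sketch, not part of the argument: the function names (exactly_two, at_least_one, true_in_all) and the cut-off max_size are hypothetical choices made for the example. Because these sentences contain only quantifiers and identity, their truth value in an interpretation depends on nothing but the size of the domain, so it is enough to run through domains of size 1, 2, 3, and so on.

```python
from itertools import product


def exactly_two(domain):
    """Evaluate (∃x)(∃y)(x ≠ y & (z)(z = x ∨ z = y)) over a finite domain."""
    return any(
        x != y and all(z == x or z == y for z in domain)
        for x, y in product(domain, repeat=2)
    )


def at_least_one(domain):
    """Evaluate (∃x)(x = x), which holds in every non-empty domain."""
    return any(x == x for x in domain)


def true_in_all(sentence, max_size=5):
    """Check a sentence in every domain of size 1..max_size.

    This is a finite sample only (an assumption of the sketch): a False
    result refutes logical truth, a True result does not establish it.
    """
    return all(sentence(range(n)) for n in range(1, max_size + 1))


# Truth value of 'There are exactly two things' by domain size.
results = {n: exactly_two(range(n)) for n in range(1, 5)}
print(results)  # {1: False, 2: True, 3: False, 4: False}
```

On this sketch 'There are exactly two things' comes out true in the two-element domain and false in the others, so it fails the test, whereas 'There is at least one thing' passes in every non-empty domain sampled.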

It might be objected that this argument is much too swift, since we can 
also prove the independence of admittedly logical axioms. To do this we 
provide unintended ‘interpretations’ of them in which the intended 
meaning of the logical terms (given by the usual rules of interpretation) 
is changed. Now, it might be argued, the axioms of set theory implicitly 
define the intended meaning of ‘∈’. Hence any ‘interpretation’ which
falsifies one of those axioms must be one in which the intended meaning
of ‘∈’ is changed. If we confine ourselves to intended interpretations,
then the axioms of set theory will come out true in all interpretations and 
hence logically true. 

But the argument is obviously circular. If the axioms of set theory
implicitly define ‘∈’, then trivially any interpretation which falsifies one
of those axioms must distort its meaning. In this way any (consistent)
set of axioms could be deemed logically true. I see no reason to suppose
that the meaning of ‘∈’ has been changed in an interpretation which
falsifies, say, the Power-Set Axiom.2 Moreover, Cohen’s results show that
there are alternative, equally consistent, set theories, one in which the
Generalized Continuum Hypothesis is an axiom, and another in which
its negation is an axiom.3 We cannot, on pain of contradiction, deem all
set-theoretical axioms logically true. And how could we defend the claim
that one of these theories has logically true axioms, and the other a
logically false one?

1 Indeed, it is a theorem of Zermelo-Fraenkel set theory (ZF) that, if ZF is consistent 
then for any axiom A of ZF there is a set in which all the axioms are true except A. 

2 Each of the existential axioms of set theory can be falsified by assigning to the ‘∈’-relation
a proper subset of the ‘intended extension’ of that relation. It is different with
the Axiom of Extensionality, which states that two sets are identical if they have the
same members and which makes no existence claim. Any interpretation in a universe
of sets which falsifies this axiom could be said to involve a change in the meaning of ‘∈’,
since this axiom can be described as a partial implicit definition of ‘∈’. (Another exception
might be the Axiom of Foundation, which excludes, among other things, any set being
a member of itself.) It is the existential axioms which are problematic: to say that these
help to implicitly define ‘∈’ is to say that we do not really know what ‘∈’ means before
we know which sets exist. If we applied this view to the universal quantifier, we would
hold that any variation in the cardinality of the domain yields a non-standard interpretation
which changes the meaning of ‘All’. And this would lead to the Wittgenstein-Ramsey
view discussed in n. 2, p. 105, above.

3 For a non-technical account which exploits the analogy with alternative geometries, see
Cohen and Hersh [1967]. The same point applies to the various set theories obtained by
adjoining ‘Strong Axioms of Infinity’ to the usual axioms.

The conclusion I draw from all this is that set theory (or Russell’s type
theory with the additional axioms) ought not to be counted a branch of 
logic. To extend the customary list of logical notions to include ‘∈’ smacks
of arbitrary fiat. But more importantly, even if we do count ‘∈’ logical it is
only by deforming the customary notion of logical truth that the axioms 
of set theory (or Russell’s additional axioms) can be counted logical truths. 
The logicists did not achieve their declared aim. Their great, if un- 
intended, achievement was the reduction of classical mathematics to set 
theory (or to Russell’s theory), both fundamental mathematical theories. 
This conclusion is far from new; indeed, many will feel that I have been 
labouring the obvious. Mostowski, for example, writes (in his [1965], p. 7):
The logicism of Frege and Russell tries to reduce mathematics to logic. This 
seemed an excellent programme, but when it was put into effect, it turned out 
that there was simply no logic strong enough to encompass the whole of mathe- 
matics. Thus what remained from this programme is a reduction of mathematics 
to set theory. This can hardly be said to be a satisfactory solution to the problem
of foundations of mathematics since among all mathematical theories it is just
the theory of sets that requires clarification more than any other.


The Kneales agree, saying that once Russell had to postulate the Axiom 
of Infinity the logicist thesis was destroyed (Kneale and Kneale [1962], 
p. 699): 

There is something profoundly unsatisfactory about the axiom of infinity. It 
cannot be described as a truth of logic in any reasonable use of that term and so
the introduction of it as a primitive proposition amounts in effect to the abandonment
of Frege’s project of exhibiting arithmetic as a development of logic.


Quotations like this could be multiplied. Even the early logicists themselves 
seem to have reached the same verdict. At any rate, Frege gave up the 
attempt to base arithmetic upon logic, and tried instead to give classical 
mathematics a geometrical foundation.1 Even Russell, in his pessimistic
moments, confessed that it was he and not ‘the philosophers’ who had to 
admit defeat. Being Russell, he did it gracefully; reflecting on his eightieth 
birthday, he saw the main achievement of his intellectual life in the 
following terms (Russell [1969], p. 220): 


I wanted certainty in the kind of way in which people want religious faith. 
I thought that certainty is more likely to be found in mathematics than elsewhere. 
. . . But as the work proceeded, I was continually reminded of the fable about
the elephant and the tortoise. Having constructed an elephant upon which the
mathematical world could rest, I found the elephant tottering, and proceeded to
construct a tortoise to keep the elephant from falling. But the tortoise was no
more secure than the elephant, and after some twenty years of very arduous toil,
I came to the conclusion that there was nothing more that I could do in the way
of making mathematical knowledge indubitable. Then came the First World
War, and my thoughts became concentrated on human misery and folly.

1 By 1924 Frege had come to the conclusion that “the paradoxes of set theory have
destroyed set theory”. He continued: “The more I thought about it the more convinced
I became that arithmetic and geometry grew from the same foundation, indeed from the
geometrical one; so that the whole of mathematics is actually geometry”. (These two
remarks are quoted by Bynum in his Introduction to Frege [1972]; cf. pp. 53-4.)


This makes the historical riddle with which I began all the more 
puzzling. If the logicist programme was a dead-duck by about 1920, 
how could the positivists make the logicist thesis a cornerstone of their 
position in the 1920s? 


4 LOGICISM RESCUED BY THE IF-THENIST MANOEUVRE


It was actually Russell who found a way to rescue logicism from defeat;
and the key to it was provided by the problem of assimilating geometry to
logic. Frege had actually excluded geometry from the logicist thesis, and 
had endorsed Kant’s view of it: 


I consider KANT did great service in drawing the distinction between synthetic 
and analytic judgements. In calling the truths of geometry synthetic and a priori,
he revealed their true nature. (Frege [1884], section 89, pp. 101-2) 

Russell, on the other hand, thought that the discovery of non-Euclidean
geometries had undermined Kant’s original position, and in his first major 
publication he tried to rescue it. In his Foundations of Geometry of 1897, 
Russell sought what was common to Euclidean and non-Euclidean 
systems, found it in the axioms of projective geometry, and took a Kantian 
view of them. As for the additional axioms which distinguished Euclidean 
from non-Euclidean systems, these were empirical statements (Russell 
[1897], Introduction, section 9). But after he had adopted the logicist 
thesis, Russell sought a way to bring geometry into the sphere of logic. 
And he found it in what I shall call the If-thenist manoeuvre: the axioms
of the various geometries do not follow from logical axioms (how could 
they, for they are mutually inconsistent?), nor do geometrical theorems; 
but the conditional statements linking axioms to theorems do follow from 
logical axioms. Hence geometry, viewed as a body of conditional statements, 
is derivable from logic after all. As Russell himself put it (in the Intro- 
duction to the second edition of his [1903], p. vii): 

It was clear that Euclidean and non-Euclidean systems alike must be included in pure
mathematics, and must not be regarded as mutually inconsistent; we must, therefore, only
assert that the axioms imply the propositions, not that the axioms are true and
therefore that the propositions are true.


Russell argued that the discovery of non-Euclidean geometries forced us 
to distinguish pure geometry, a branch of pure mathematics whose 
assertions are all conditional, from applied geometry, a branch of empirical
science. After describing the emergence of non-Euclidean geometry, he
says (Russell [1903], p. 373):


Geometry has become . . . a branch of pure mathematics, that is to say, a subject 
in which the assertions are that such and such consequences follow from such
and such premisses, not that entities such as the premisses describe actually exist. 
That is to say, if Euclid’s axioms be called A, and P be any proposition implied 
by A, then, in the Geometry which preceded Lobatchewsky, P itself would be 
asserted, since A was asserted. But nowadays, the geometer would only assert 
that A implies P, leaving A and P themselves doubtful. 


In this way the axioms of the various geometries cease to be problematic
for the logicist, because they cease to be asserted as axioms at all (let alone
asserted to be derivable from logical axioms):


The so-called axioms of geometry, for example, when Geometry is considered 
a branch of pure mathematics, are merely the protasis in the hypotheticals which 
constitute the science. They would be primitive propositions if, as in applied 
mathematics, they were themselves asserted; but so long as we only assert 
hypotheticals . . . in which the supposed axioms appear as protasis, there is no
reason to assert the protasis, nor, consequently, to admit genuine axioms. 
(Russell [1903], p. 430) 

Russell’s If-thenist construal of geometry does, in fact, have a long history. 
Descartes, anxious to render mathematical truths immune from most 
sceptical attacks, hinted in the First Meditation that they are all con- 
ditional and hence do not assert existence: 


... Arithmetic, Geometry and other sciences of that kind which only treat of
things . . . without taking any great trouble to ascertain whether they are actually 
existent or not, contain some measure of certainty and an element of the in- 
dubitable. (Descartes [1911], volume I, p. 147) 


Locke agreed: 


All the discourses of the mathematicians about the squaring of the circle, conic 
sections, or any other part of mathematics, concern not the existence of any of 
these figures: but their demonstrations, which depend on their ideas, are the 
same, whether there be any square or circle existing in the world or no. (Locke 
[1690], book IV, chapter iv, section 8) 


And Leibniz, that great forerunner of logicism, echoed the point: 


As to eternal truths, it is to be noted that at bottom they are all conditional, and 
say in effect: Granted such a thing, such another thing is. For instance, when I
say ‘Every figure which has three sides will also have three angles’, I say nothing
but this, that supposing there is a figure with three sides, this same figure will
have three angles. (Leibniz [1916], book IV, chapter 11, section 14) 


Not only does Russell’s position have a long history. It is also a position 
which is widely accepted, at least as regards geometry, by many who would
not regard themselves as logicists or logical empiricists.
Now having already applied the If-thenist manoeuvre to the problematic 
axioms of geometry, it was natural for Russell to apply it also to the 
problematic Axioms of Reducibility, Infinity, and Choice.1 These were
not logical truths, since they made specific existence claims which were 
false in some ‘possible worlds’. Hence postulating them as axioms would 
mean the end of the logicist programme. But conditional statements 
linking them to ‘theorems’ derivable from them will still be derivable 
from logic provided that the derivation of the ‘theorem’ from the ‘axioms’ 
was correct. This is exactly the course which Russell took. In ‘Mathematical
Logic as Based on the Theory of Types’, written in 1908, he
confessed that he could not prove the Axiom of Choice from logic and
would therefore “state it as a hypothesis on every occasion on which it is
used” (see Russell [1956], p. 99). The same manoeuvre occurs in Principia
Mathematica regarding the problematic ‘axioms’. Of the Axiom of Choice
Russell and Whitehead say (Russell and Whitehead [1910-13], volume I,
p. 504):

We have not assumed its truth in the general [non-finite] case where it cannot 
be proved, but have included it in the hypotheses of all propositions which 
depend upon it. 


And of the Axiom of Infinity they write (Russell and Whitehead [1910-13], 
volume II, p. 183): 


This assumption, like the multiplicative axiom [Axiom of Choice], will be ad- 
duced as a hypothesis wherever it is relevant. It seems plain that there is nothing 
in logic to necessitate its truth or falsehood, and that it can only be legitimately 
believed or disbelieved on empirical grounds. 


Finally, Russell claimed that the If-thenist manoeuvre must be applied 
to any principle which is problematic from a logicist point of view: 


...no principle of logic can assert ‘existence’ except under a hypothesis... 
Propositions of this form, when they occur in logic, will have to occur as 
hypotheses or consequences of hypotheses, not as complete asserted
propositions . . . (Russell [1919], p. 204)


Clearly this would apply to all the problematic (existential) axioms of
set theory, so that set theory construed as a body of conditional statements
might be shown to be derivable from logic as the logicist thesis requires.

1 Even Frege had a brief flirtation with the idea. He amended his Basic Law (V) to avoid
Russell’s paradox. He then suggested that his amended law could be insulated from
sceptical doubt if it were never asserted but rather always made the antecedent of conditional
theorems, concluding “. . . even now I do not see how arithmetic can be scientifically
founded, how numbers can be conceived as logical objects, unless we are allowed,
at least conditionally, the transition from a concept to its extension” (Frege [1964],
p. 127). Clearly, Frege never took the If-thenist manoeuvre too seriously, but Russell
and the logical positivists were more enthusiastic about it.

By using the If-thenist manoeuvre, Russell arrives at a position which
is far removed from his original logicism. The claim that all so-called
mathematical axioms can be deduced from logical axioms (thesis (B) of 
old-style logicism) is weakened to read: 


(B*) either an apparently primitive proposition of mathematics can be 
deduced from logical axioms or it is not to be regarded as a primitive 
proposition at all but only as the antecedent of various conditional 
statements (all of which are derivable from logic in view of thesis (C)). 


It turned out, in fact, that only a fragment of arithmetic (finite arithmetic) 
could be ‘reduced to logic’ in the way that old-style logicism demands.1
The rest of mathematics could be ‘reduced to logic’ only if the If-thenist 
manoeuvre was applied to it first. 

I think it fair to say that Russell never fully realised how far this new 
position was from logicism proper. And there was a special reason for this:
his failure to distinguish a rule of inference, a conditional statement of the
form ‘If A then B’, and a universal statement of the form ‘(x)(Fx ⊃ Gx)’.
Some mathematical axioms fall into the third category. Applying the If- 
thenist manoeuvre results in statements of the second category. If we 
identify the two, we can still suppose that, even after adopting the If- 
thenist manoeuvre, we are deriving mathematical axioms from logic. 
And if we confuse both of these with rules of inference, we can suppose 
that we are deriving them from rules of inference. Because of these con- 
fusions, one always finds If-thenism rubbing shoulders with logicism 
proper in Russell’s writings. As far back as 1901 we find a passage often 
smiled over but seldom understood, which immediately precedes the state- 
ment of logicism proper which I quoted on page 100 above: 


Pure mathematics consists entirely of assertions to the effect that, if such and 
such a proposition is true of anything, then such and such another proposition 
is true of that thing. It is essential not to discuss whether the first proposition 
is really true, and not to mention what the anything is, of which it is supposed 
to be true. Both these points would belong to applied mathematics. We start, 
in pure mathematics, from certain rules of inference, by which we infer that 
if one proposition is true, then so is some other proposition. These rules of
inference constitute the major part of the principles of formal logic. We then
take any hypothesis that seems amusing, and deduce its consequences. If our
hypothesis is about anything, and not about some one or more particular things,
then our deductions constitute mathematics. Thus mathematics may be defined
as the subject in which we never know what we are talking about, nor whether
what we are saying is true. People who have been puzzled by the beginnings of
mathematics will, I hope, find comfort in this definition, and will probably agree
that it is accurate. (Russell [1917], p. 75)

1 As is admitted by Russell in the Introduction to the second edition of his [1903], p. viii.
Not even Peano’s axioms for arithmetic can be ‘derived from logic’ in the original
logicist sense: that axiom which states that no two numbers have the same successor
requires the ‘Axiom of Infinity’ for its proof (see Russell [1919], pp. 131-2).


It would be tedious to trace Russell’s confusion through this passage, or 
through the similar passages at the outset of his Principles of Mathematics 
(see Russell [1903], chapter 1). Nor should we be too hard on Russell
for them.1 I mention them only because they often blinded him to the
difference between his original thesis and his final one. 


I now turn to the logical positivists, and to our historical riddle. The 
solution to the riddle is this: it was not old-style logicism which the 
positivists adopted, but rather logicism spiced with varying doses of 
If-thenism. Mind you, the rhetoric of old-style logicism persists, and is 
used as a stick to beat philosophical opponents. In 1930 Carnap begins by 
telling us that Whitehead and Russell had confirmed Frege’s view that 
“mathematics is to be considered a branch of logic”, in the following
way: 


It was shown that...every mathematical sentence (insofar as it is valid in 
every conceivable domain of any size) can be derived from the fundamental 
statements of logic... all the...sentences of arithmetic and analysis (to the 
extent that they are universally valid in the widest sense) are provable as sentences 
of logic. (Carnap [1930], pp. 140-1.) 


The qualifications are, of course, crucial. The reader may wonder about 
all those mathematical sentences which are not “universally valid in the 
widest sense”, which are not “provable as sentences of logic”, and which 
make up the greater part of mathematics. Carnap says nothing to enlighten 
him about these. Instead, two pages later the unqualified thesis of old-style 
logicism is used as a stick with which to beat the opponents of empiricism: 


Mathematics, as a branch of logic, is also tautological. In the Kantian termin- 
ology: The sentences of mathematics are analytic. They are not synthetic 
a priori. Apriorism is thereby deprived of its strongest argument. Empiricism, 
the view that there is no synthetic a priori knowledge, has always found the 
greatest difficulty in interpreting mathematics, a difficulty which Mill did not 
succeed in overcoming. This difficulty is removed by the fact that mathematical 
sentences are neither empirical nor synthetic a priori but analytic. (Carnap 
[1930], p. 143) 


1 Especially not when we reflect that the many upholders of the so-called ‘inference- 
license’ view of universal statements are victims of the same confusion. And when 
we reflect also that the very great difference between old-style logicism and If-thenism
is frequently overlooked: for two examples among many see Pap [1949], pp. 108-9, or 
Robinson [1964], pp. 83 and 85. 
Carnap is more forthcoming about the breakdown of the original logicist
programme and how he proposes to deal with it a year later, in his paper
‘The logicist foundations of mathematics’. Again, the opening rhetoric
is that of old-style logicism: we are told, for example, that “The theorems
of mathematics can be derived from logical axioms through purely logical
deduction” (Carnap [1931], p. 31). But it soon transpires that ‘theorem’ 
does not mean what mathematicians standardly mean by it when they speak, 
for example, of the Prime Number Theorem or Pythagoras Theorem. 
For Carnap tells us of the discovery of the paradoxes, the theory of types, 
and the necessity to introduce the Axioms of Infinity and Choice. Then he
continues:


Russell was right in hesitating to present them as logical axioms, for logic... 
cannot make assertions about whether something does or does not exist. Russell 
found a way out of this difficulty. He reasoned that since mathematics was also 
a purely formal science, it too could make only conditional, not categorical, 
statements about existence: if certain structures exist, then there also exist
certain other structures whose existence follows logically from the existence of
the former. For this reason he transformed a mathematical sentence, say S, the
proof of which required the axiom of infinity, I, or the axiom of choice, C,
into a conditional sentence; hence S is taken to assert not S, but I ⊃ S or C ⊃ S,
respectively. This conditional sentence is then derivable from the axioms of
logic. (Carnap [1931], pp. 34-5)

Here, then, Carnap takes Russell’s way out of the dilemma. The problem- 
atic Axioms of Infinity and Choice (or of set theory, or of geometry) 
cease to be problematic because they cease to be axioms at all. 
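
Put schematically, the transformation trades a derivation from an axiom for a derivation, from logic alone, of a conditional. The display below is a gloss rather than anything stated in the texts quoted: it assumes that I and C are closed sentences and that the underlying logic is one for which the deduction theorem holds.

```latex
\[
  I \vdash S \;\Longrightarrow\; \vdash I \supset S ,
  \qquad
  C \vdash S \;\Longrightarrow\; \vdash C \supset S .
\]
```

What is then derivable from logic is the conditional on the right, not the mathematical sentence S itself.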

Carnap changed his mind later, however, and decided that the Axioms of 
Infinity and Choice were analytic after all. He justified this in the case 
of the Axiom of Infinity by taking it to assert the existence, not of infinitely 
many objects, but of infinitely many positions in space (see Carnap [1937],
pp. 141-2). It is unclear to me (and to Russell and Copi) why “the 
existence of infinitely many positions is less an empirical question than the 
existence of infinitely many objects” (Copi [1971], p. 67). And even if we 
accept Carnap’s view of the Axiom of Infinity, the problematic Axiom 
of Choice remains. Yet in 1939 Carnap returns to the old-style logicist 
thesis that “all mathematical signs become logical signs, all mathematical 
theorems L-true propositions” (Carnap [1939], p. 48). This does not 
apply, however, to the theorems of the various geometries, to which the 
If-thenist manoeuvre is applied.2

1 The same position seems to have been adopted by Behmann in his [1934].
2 See Carnap [1963], pp. 49-50; on pp. 47-8 Carnap reaffirms his view that the Axioms
of Infinity and Choice are analytic.

This later position of Carnap’s is also endorsed by Hempel in his paper
‘On the nature of mathematical truth’ of 1945. As we already saw, Hempel
classes ‘∈’ as a logical term (without argument), and concludes that the
Axiom of Infinity “may be treated as an additional postulate of logic”
on the (unargued) ground that “it is capable of expression in purely logical
terms” (the same applies, it turns out, to the Axiom of Choice). Hempel
then propounds: 

... the thesis of logicism concerning the nature of mathematics:

Mathematics is a branch of logic. It can be derived from logic in the following
sense:
a. All the concepts of mathematics, i.e. of arithmetic, algebra, and analysis,
can be defined in terms of four concepts of pure logic.

b. All the theorems of mathematics can be deduced from those definitions by
means of the principles of logic (including the axioms of infinity and choice).
(Hempel [1945], pp. 377-8)

But Hempel is aware of the problems posed for old-style logicism by
geometry and related fields—‘mathematics’ here does not include: 

... those mathematical disciplines which are not outgrowths of arithmetic and
thus of logic; these include in particular topology, geometry, and the various
branches of abstract algebra, such as the theories of groups, lattices, fields, etc.
Each of these disciplines can be developed as a purely deductive system on the
basis of a suitable set of postulates. If P be the conjunction of the postulates
for a given theory, then the proof of a proposition T of that theory consists in
deducing T from P by means of the principles of formal logic. What is estab-
lished by the proof is therefore not the truth of T, but rather the fact that T is
true provided that the postulates are. (Hempel [1945], p. 380)


Thus Hempel does not apply the If-thenist manoeuvre to the Axioms of
Infinity and Choice, but he does apply it to the ‘axioms’ of large portions
of mathematics. 

There were, of course, other positivists who merely affirmed old-style
logicism without mentioning its difficulties. Hahn, writing in 1933,
contents himself with the declaration that, despite appearances, all mathe- 
matical propositions are tautologies, true by virtue of the meanings of the 
signs they contain. He seems barely aware of the difficulties logicism had 
encountered: 

To be sure, the proof of the tautological character of mathematics is not yet 
complete in all details. This is a difficult and arduous task; yet we have no
doubt that the belief in the tautological character of mathematics is essentially
correct. (Hahn [1933], p. 158)

Ayer’s Language, Truth and Logic, first published in 1936, contains a
similar account. Mathematical propositions are, he says, all analytic: 

... the criterion for an analytic proposition is that its validity should follow
simply from the definition of the terms contained in it, ... this condition is
fulfilled by the propositions of pure mathematics. (Ayer [1936], p. 82)


Geometrical propositions are, however, treated in accordance with the 
If-thenist manoeuvre (see Ayer [1936], pp. 26, 82-4). Ayer does dissent 
from old-style logicism on one point: 


A point which is not sufficiently brought out by Russell . . . is that every logical 
proposition is valid in its own right. Its validity does not depend on its being 
incorporated in a system, and deduced from certain propositions which are 
taken as self-evident... The fact that the validity of an analytic proposition in 
no way depends on its being deducible from other analytic propositions is our 
justification for disregarding the question whether the propositions of mathe-
matics are reducible to propositions of formal logic, in the way that Russell
supposed. For even if... it is not possible to reduce mathematical notions to 
purely logical notions, it will still remain true that the propositions of mathe- 
matics are analytic propositions. They will form a special class of analytic 
propositions, containing special terms, but they will be none the less analytic 
for that. (Ayer [1936], pp. 81-2) 


Here Ayer dissents from the thesis that all mathematical notions are
definable from purely logical ones (thesis (A) of old-style logicism). But 
he does not, I think, dissent from the other two theses. His idea seems to 
be that mathematical statements are like “All men are men”, which con- 
tains a term (‘men’) not definable in logical terms but which is still logically 
true; or like “All bachelors are unmarried”, which also contains non- 
logical terms but which becomes a logical truth when we replace a defined 
term (‘bachelor’) by its definiens (‘unmarried man’). The derivability of 
mathematics from logic is not denied. Ayer merely insists that a truth is 
not analytic because it is derivable from logic, but because of its logical 
form: 


For it is possible to conceive of a symbolism in which every analytic proposition 
could be seen to be analytic in virtue of its form alone. (Ayer [1936], p. 81) 


Like Hahn, then, Ayer simply ignores the breakdown of old-style logicism.2


2 Donald Gillies takes a different view of the Hahn-Ayer position, in an unpublished
paper of his called ‘Logicism and the Logical Positivists’ which was stimulated by an
earlier version of the present paper and of which he kindly sent me a copy. Gillies thinks
that the ‘analytic view of mathematics’, the view that mathematical propositions are
true by virtue of the meanings of the words they contain, is a different view from logicism.
He admits that the view is vague, and hopes to make it less so by developing a Wittgen-
steinian theory of meaning. I doubt that he will succeed. I think the only way to make
‘analyticity’ precise is to identify the analytic statements with statements which are either
(a) logical truths, or (b) statements which become logical truths when conventionally
defined terms are replaced by their defining terms. (It is perhaps worth adding that
none of Quine’s strictures against ‘analyticity’, in his famous [1951], apply to such a
construal of it: Quine merely points out that what is ‘analytic’ in a natural language is
vague, because what is conventionally defined in terms of what in natural language is
vague. Quite so. But I have never quite understood how this shows that logical truths, as

I conclude that, in so far as the logical positivists had a defensible
philosophy of mathematics at all (old-style logicism not being defensible),
it was logicism spiced with varying amounts of If-thenism. Now this view
of mathematics is a far less potent philosophical weapon than old-style
logicism. The latter sought to show that, appearances notwithstanding,
statements like “There are infinitely many primes” are logical truths,
hence true in all possible worlds, hence factually empty. If this had
succeeded, it would have been a very powerful argument indeed for the
basic thesis of logical empiricism, the analytic/synthetic dichotomy. But it
failed, and with the position the positivists actually adopted (their rhetoric
aside) the situation is very different. The positivist confronts statements
like “There are infinitely many primes”, sees that they are neither syn-
thetic nor analytic truths as his basic thesis requires, and therefore refuses
to regard them as assertible statements at all. Using the If-thenist device,
he construes all apparent assertions of statements which upset his central
dogma as disguised conditionals, and claims that these are logically true.
This is not an argument for the central dogma of positivism; it is a result
of applying it to problematic cases.

So applying the If-thenist manoeuvre gives us a position which is of 
less philosophical interest than its predecessor. Some might be tempted 
to say that old-style logicism was a bold and exciting thesis which sadly 
turned out to be wrong, while its offspring is a puny imitation of it gene- 
rated out of an ad hoc device to save the parent from defeat. But philoso- 
phers ought perhaps to love truth more than excitement. And they might 
well reply that the offspring, though less exciting than its parent, has the 
great advantage of being true. I discuss this question in my last section. 


5 IF-THENISM 


So far we have only considered the If-thenist manoeuvre in its historical 
setting. Thus considered, it does seem ad hoc, being applied piecemeal 
only to mathematical statements which turned out to be problematic for 
the old-style logicist. It is high time that we removed it from this historical 
setting, and let it stand on its own two feet. We then arrive at a more 
thoroughgoing position, which can be expressed by the following two 
claims:


opposed to views about what is ‘analytic’ in some natural language, are open to revision
in the light of empirical evidence.)

But whether or not Gillies succeeds in making his Wittgensteinian view of ‘analyticity’
precise, I doubt that it was the view of Hahn or Ayer. They simply ignored, or were
ignorant of, the collapse of old-style logicism.

(F) a mathematical statement is a conditional statement with a conjunction
of ‘mathematical axioms’ as antecedent and a ‘mathematical theorem’
as consequent;

(G) all true mathematical statements can be deduced from logical axioms.
I will call this position If-thenism. And I will call (F) the If-thenist
prohibition, since it prohibits the pure mathematician from asserting the
truth of any of his ‘axioms’ or ‘theorems’. (Anybody who asserts the truth
of a mathematical ‘axiom’ or ‘theorem’ is, according to (F), an applied 
mathematician and his assertion is at bottom an empirical one.) 

If-thenism is a much weaker position than old-style logicism. All that 
remains of old-style logicism is claim (C), which has become claim (G).
The two are equivalent by virtue of the Deduction Theorem: a mathe- 
matical ‘theorem’ is deducible from logical axioms together with (closed) 
mathematical ‘axioms’ just in case the conditional statement linking the 
mathematical ‘axioms’ to the mathematical ‘theorem’ is deducible from 
logical axioms alone. Claim (C), or claim (G), is not a trivial claim: what 
it says is that all mathematical proofs can be formalised. The If-thenist 
will maintain that a real achievement of the early logicists was to have
shown that claim (G) is correct. (A second real achievement was the 
unification of classical mathematics under set theory or the theory of types, 
which the If-thenist will regard as a primarily mathematical achievement.) 
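
The equivalence of claims (C) and (G) can be written out as a single statement. The following is only a sketch in standard notation, which is assumed here for illustration and is not the paper's own: $A_1, \dots, A_n$ stand for the closed mathematical ‘axioms’, $T$ for the ‘theorem’, and $\vdash$ for formal derivability from logical axioms.

```latex
\[
  A_1, \dots, A_n \vdash T
  \quad\Longleftrightarrow\quad
  \vdash (A_1 \wedge \dots \wedge A_n) \supset T
\]
```

The left-to-right direction is the Deduction Theorem; the right-to-left direction is an application of modus ponens. A theory with infinitely many axioms is covered in the same way, since any single derivation uses only finitely many of them.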

But an If-thenist will regard the other enterprises of the early logicists 
as misguided. The early logicists tried to establish their claims (A) and (B) 
for Peano’s arithmetic. But an If-thenist does not need to try to define 
Peano’s primitive notions in logical terms, nor does he need to try to
derive Peano’s axioms from logical axioms. All that the If-thenist needs to 
do in order to bring arithmetic into the sphere of logic is show that Peano’s 
‘theorems’ really can be formally derived from his ‘axioms’. Similarly, all 
the logicists’ worries about the axioms of Infinity and Choice (or the axioms
of set theory) are misguided from an If-thenist point of view. The mathe-
matician merely derives ‘theorems’ from these ‘axioms’; he does not assert
that the ‘axioms’ are true (let alone logically true). 

So If-thenism is weaker than logicism proper—but is it true? At the 
heart of it is the If-thenist prohibition, (F), which says that the pure 
mathematician does not assert the truth of his ‘axioms’ or ‘theorems’, 
but only that of conditionals linking the two. Now I have not been around 
with a tape-recorder, but I suspect that many working mathematicians 
would not accept this prohibition, which turns them into rather sophisti-
cated logicians. However, sociological facts about mathematicians are
not philosophical arguments. An If-thenist need not be too impressed, 
even if an exhaustive survey of mathematicians should reveal that not one 
of them accepts his philosophy. He might reply that, just as fish are good
swimmers but are not much good at hydrodynamics, so also good mathe-
maticians are not much good at the philosophy of mathematics.

Mathematicians might, however, have a good reason for rejecting the 
If-thenist view of mathematics. For it applies, straightforwardly, only to 
axiomatised portions of mathematics. But mathematicians do creative 
work in areas which have not yet been axiomatised: think of geometry 
before Euclid (or perhaps Hilbert), analysis before Cauchy (or perhaps 
Weierstrass), or arithmetic before Peano (or perhaps Frege). If-thenism
has nothing to say about un-axiomatised or pre-axiomatised mathematics, 
in which many creative mathematicians work. Therefore, even if its 
account of axiomatised mathematics is acceptable, as an account of
mathematics as a whole it is seriously defective.

One philosopher of mathematics would have accepted this argument,
and taken it even further. Imre Lakatos, in the Introduction to his Proofs 
and Refutations, attacks what he calls ‘formalism’, the identification of 
mathematics with formally axiomatised systems (and of the philosophy of 
mathematics with meta-mathematics or the study of such systems). 
According to Lakatos, formalism excludes from consideration all creative,
growing mathematics. A mathematical theory can be formally axiomatised 
only after the creative mathematical work is done and the theory has 
ceased to grow: formal axiomatisation is, one might say, the kiss of death 
which turns a living thing into a museum piece. Lakatos does not deny 
that creative mathematics can be done about formally axiomatised theories 
by meta-mathematicians: but the formally axiomatised theory is the
subject-matter, and the work is done in an informal meta-mathematical 
theory: 

Nobody will doubt that some problems about a mathematical theory can only 
be approached after it has been formalized, just as some problems about human 
beings (say concerning their anatomy) can only be approached after their death. 
But few will infer from this that human beings are ‘suitable for scientific in- 
vestigation’ only when they are ‘presented in “dead” form’, and that biological 
investigations are confined in consequence to the discussion of dead human
beings—although, I should not be surprised if some enthusiastic pupil of 
Vesalius in those glorious days of early anatomy, when the powerful new 
method of dissection emerged, had identified biology with the analysis of dead 
bodies. (Lakatos [1976], p. 3, note 3) 

Now since If-thenism applies straightforwardly only to axiomatised 
theories, Lakatos would presumably regard it, not as a philosophy of 
mathematics, but as a philosophy of dead mathematics. We saw how
If-thenism grew historically out of the basic dogma of logical empiricism. 
Lakatos claims that the exclusion of informal mathematics from mathe-
matics stems from the same source:

‘Formalism’ is a bulwark of logical positivist philosophy. According to logical
positivism, a statement is meaningful only if it is either ‘tautological’ or empirical.
Since informal mathematics is neither ‘tautological’ nor empirical, it must be
meaningless, sheer nonsense. The dogmas of logical positivism have been
detrimental to the history and philosophy of mathematics. (Lakatos [1976], pp. 2-3)

But does the existence of informal, pre-axiomatised mathematics present
an insuperable obstacle to the If-thenist view? To answer this question, 
let us look briefly at Lakatos’s own account of informal mathematics, as 
presented in Proofs and Refutations. 

Lakatos describes how, often for quasi-empirical reasons, mathematicians 
get interested in certain mathematical entities: the geometer in plane 
figures or polyhedra, the arithmetician in prime numbers, the analyst in
areas under curves. They propose conjectures about these entities, and
try to prove them. Both the conjectures and the proofs are criticised and, 
through intricate dialectical processes, improved. This process of trial and 
error (proof and refutation) results in a growing body of knowledge about 
the entities in question, organised in a more or less ramshackle deductive 
structure. An axiomatiser may then come along and look for a small
number of true statements (axioms) from which all the other known 
truths in the field (theorems) can be derived. 

What will an If-thenist say to an account such as this? He might well 
applaud it as a contribution to the history of informal mathematics, while 
insisting that it is philosophically question-begging. It speaks of the 
mathematician conjecturing, and trying to prove, that some categorical 
mathematical statement is true. Thus, for example, an arithmetician might 
conjecture, and try to prove, that there are infinitely many prime numbers. 
But the If-thenist denies that sense can be made of categorical claims like 
this, unless, of course, they amount to some sort of empirical claim. For 
him, to say that this proposition is true of the natural numbers is simply 
to say that it is deducible from the axioms which characterise the natural 
number sequence. Informal mathematics, so brilliantly depicted by 
Lakatos, is simply the process of creating axiomatic structures. And 
Lakatos’s ‘mathematical conjectures’ are, at bottom, logical conjectures: 
to use Lakatos’s own example, the Descartes-Euler conjecture that for
all polyhedra V − E + F = 2 is simply the conjecture that the concept of
‘polyhedron’ can be defined in such a way that this proposition will be 
deducible from geometrical axioms together with the definition. Mathe- 
matical assertions, even the assertions of informal mathematics, are all 
disguised conditionals. What else, given that they are not empirical claims, 
could they be?1


1 Having said this, the If-thenist might go on to disassociate himself from some of the
excesses of ‘formalism’ to which Lakatos rightly objects. Of course (he might say)

The If-thenist challenge implicit in this question is not an easy one to
answer. As an indication of this, I will consider Quine’s quick dismissal
of If-thenism. Using ‘Hunt (sphere, includes)’ to abbreviate Huntington’s
geometrical axioms, and ‘T (sphere, includes)’ to abbreviate a theorem
deducible from those axioms, Quine writes:


But if as a truth of mathematics ‘T (sphere, includes)’ is short for ‘If Hunt
(sphere, includes) then T (sphere, includes)’ still there remains, as part of this
expanded statement, ‘T (sphere, includes)’; this remains as a presumably true
statement within some body of doctrine, say for the moment ‘non-mathematical
geometry’, even if the title of mathematical truth be restricted to the entire
hypothetical statement in question. The body of all such hypothetical statements
describable as ‘theory of deduction of non-mathematical geometry’ is of course
a part of logic; but the same is true of any ‘theory of deduction of sociology’,
‘theory of deduction of Greek mythology’, etc., which we might construct in
parallel fashion with the aid of postulates suited to sociology or Greek mythology.
The point of view toward geometry which is under consideration thus reduces
merely to an exclusion of geometry from mathematics, a relegation of geometry
to the status of sociology or Greek mythology; the labelling of the ‘theory of
deduction of non-mathematical geometry’ as ‘mathematical geometry’ is a verbal
tour de force which is equally applicable in the case of sociology or Greek
mythology. To incorporate mathematics into logic by regarding all recalcitrant 
mathematical truths as elliptical hypothetical statements is thus in effect merely 
to restrict the term ‘mathematics’ to exclude those recalcitrant branches. But we 
are not interested in renaming. Those disciplines, geometry and the rest, which 
have traditionally been grouped under mathematics are the objects of the 
present discussion, and it is with the doctrine that mathematics in this sense 
is logic that we are here concerned. (Quine [1936], p. 327) 


Now Quine rightly points to the ad hoc character of If-thenism: as he puts 
it, it is a “verbal tour de force” by which “recalcitrant mathematical truths” 
are regarded as “elliptical hypothetical statements”. Quine is also quite 
right to point out that this “verbal tour de force” is equally applicable to
sociology, Greek mythology, or any other empirical theory, since empirical
theories too can be axiomatised. Of course, no If-thenist does apply the
If-thenist manoeuvre in such cases: even the most ambitious logicist
baulks at assimilating sociology or economics or physics or Greek mythology 
to logic. Hence he must have an independent reason for treating mathe-
matics differently. And the logical empiricists did, of course, have such
a reason: their central dogma that there are empirical assertions and
logical assertions, but nothing else. Theories of sociology or economics
or physics or Greek mythology (even if axiomatised) fall into the first
category: hence they are not given the If-thenist treatment.1 But theories
of mathematics (pure mathematics, not applied mathematics) do not:
hence they are given the If-thenist treatment. What does the philosophical
work here is the logical/empirical dichotomy, not If-thenism itself. This
is simply the philosophical weakness of If-thenism, to which I already
drew attention.

informal mathematics is creative (though axiomatising a portion of mathematics is
creative too, as is discovering and proving a new theorem in an axiomatic system).
Of course (he might continue) the stratagems of informal mathematics are an important
and fascinating field, as Lakatos has himself shown. Of course (he might add) there is a
great difference between a more or less informal axiomatic system and a fully formalised
one, and we are far from demanding that mathematics only be conducted in the latter.
But (he might conclude) none of these concessions alters my central thesis: that mathe-
matical assertions, properly construed, are all conditional in form and, if true, logically
true.

More to the point now, however, is Quine’s own view of the “recalcitrant 
mathematical truths”. He says that ‘T (sphere, includes)’ “remains as a 
presumably true statement within some body of doctrine”. But this, it
seems to me, is to fall back into If-thenism without noticing it. After all, 
a truth of sociology is simply true: it is not “true within some body of 
doctrine”. What on earth does “true within some body of geometrical 
doctrine” mean if not “deducible from axioms characterising that body of 
doctrine”? Moreover, it is especially unfortunate that Quine chose a
geometrical example. For there are alternative geometries, and Quine’s
‘T (sphere, includes)’ could well be “true within Euclidean geometry” and
“false within non-Euclidean geometry” (which is to say, of course, that it is 
deducible from Euclidean axioms, while its negation is deducible from 
non-Euclidean axioms). Anyone who says it is true simpliciter must, it 
seems, be making an empirical claim. 

Moreover, what holds for geometry also holds for large portions of 
modern mathematics, which are concerned with investigating various 
axiom systems. In abstract algebra, topology, and set theory itself, mathe- 
matical results do seem to be conditional in form. Of course, this may not 
always be apparent from their formulation: “Division is unique in any 
field” (that is, “If F is a field, then division is unique in F”); “Comple-
ments are unique in distributive lattices” (that is, “If L is a distributive
lattice, then complements are unique in L”); and so on. In all these cases 
we may claim that some group of axioms is true of some empirical subject- 
matter (after having given an empirical interpretation to the non-logical 
terms). But assertions like “Physical space is non-Euclidean” or “The
quantum-mechanical observables form a non-distributive lattice” belong,
not to pure mathematics, but to empirical science. If-thenism challenges
us to explain what we mean by saying that a mathematical axiom or
theorem is true when this does not mean that it is an empirical truth.

1 I am not so sure about Greek mythology. For the ancient Greeks it was, presumably,
a factual theory, and at least some of it was accepted as true. But for us (who call it
‘mythology’ and not ‘theology’) it is factually false. Yet we still want to say such things
as “It is a truth of Greek mythology that Apollo was the son of Zeus and Leto”. And
the only way to make sense of such claims is to adopt an If-thenist construal of them
(something like “If this-and-this basic assertion of the Greek myths holds, then Apollo
was the son of Zeus and Leto”). Or do we want to say that Greek gods really do exist
somewhere (though not, of course, in the real world), and that this assertion is true of
them? Thus I think that we do construe Greek mythology in an If-thenist fashion, though
the ancient Greeks presumably did not. (I think the problem of ‘truth in fiction’ or
‘fictional truth’ may be soluble along If-thenist lines, but that is another story.)

I can think of one way to meet this challenge (though it is not one
which would appeal to Quine). And that is to postulate, alongside the
empirical realm, a realm of mathematical entities. A mathematical ‘axiom’ 
or ‘theorem’ can be true simpliciter because there is a realm of entities 
for it to be true of, quite independently of its deducibility or otherwise 
from other mathematical statements. (The resistance of mathematicians 
to If-thenism, if it exists, might well stem from their tacit adoption of a 
view like this one. And Lakatos’s account of informal mathematics also
seems to rely on some such view.) 

But this kind of naive Platonism has many problems, not least of which 
is that posed by the existence of alternative mathematical theories. Are 
we to claim that actual ‘mathematical space’ (not physical space, but the 
space in our Platonic realm) is really Euclidean (or non-Euclidean), so 
that “The angle sum of a triangle is 180°” is really an unconditional mathe-
matical truth (or falsehood)? Are we to claim that all lattices are (are not)
distributive? Or that the ‘universe of sets’ in our Platonic realm does 
(does not) bear out the Continuum Hypothesis? I cannot make much 
sense of such claims. Nor, I suspect, could most mathematicians who 
would regard the investigation of Euclidean and non-Euclidean geometry, 
distributive and non-distributive lattices, Cantorian and non-Cantorian 
set theory, as all being equally legitimate from a mathematical point of 
view. Yet without such naive Platonistic claims we seem to be driven back 
to the If-thenist position.1


1 I think, for example, that the sophisticated evolutionary Platonism of Popper need
not trouble an If-thenist. Popper tries to combine a Platonistic view of the objectivity
of human knowledge with the Darwinian view that human knowledge is an evolutionary
product. Thus he insists that the natural numbers are a human creation (part and parcel
of the creation of descriptive languages with devices for counting things), but that once
created they become autonomous so that objective discoveries can be made about them
and their properties are not at the mercy of human whim (see Popper [1972], pp.
158-61). An If-thenist could agree with much of this. We create, first of all, languages
in which to express certain empirical claims: “Two apples placed in the same bowl as
two other apples give you four apples”; “Two drops of water placed together give you
one bigger drop of water”; etc. Then we come to treat numbers and their addition in a
more abstract way (so that the second statement just given does not count as an empirical
refutation of “1 + 1 = 2”). This is, at bottom, to create a more or less explicit collection
of ‘axioms’ for the natural number sequence. And then we find that, once these are
granted, we must also grant other statements about numbers like “There are infinitely
many prime numbers”. We discover, in other words, that our axioms have certain un-
intended logical consequences. The objectivity of mathematics is guaranteed by the
fact that what follows from what is an objective question, and we need not postulate a
realm of ‘abstract mathematical entities’ to ensure it.

However, we have yet to come to grips with some results which, it might
be maintained, not only drive a further nail into the coffin of old-style
logicism but also destroy its offspring If-thenism: Gödel’s incompleteness
theorems. Gödel showed that for any consistent (recursive) axiomatisation
of arithmetic there is an arithmetical statement such that neither it nor
its negation is formally provable from the axioms. And he showed that for
any consistent (recursive) axiomatisation of arithmetic the statement that
it is consistent cannot be formally proved within the system, so that any
proof of it must rely on methods stronger than those of arithmetic itself. How do these
results bear upon old-style logicism and If-thenism? 

There is no doubt that the early logicists thought that all arithmetical 
truths might be formally provable from arithmetical axioms (which were
in turn to be formally proved from logical axioms). There is no doubt, in 
other words, that the phrase ‘deduced from’ in theses (B) and (C) of early 
logicism meant ‘formally proved from’, and that the early logicists 
identified arithmetical truth with provability from arithmetical axioms. 
Hence Gödel’s first incompleteness theorem destroys claim (C) of early
logicism just as effectively as the discovery of the paradoxes destroyed
claim (B). 

Neither is there any doubt, it seems to me, that the If-thenists carried 
over this logicist claim about formal provability into their thesis (G). 
They claimed, in other words, that all arithmetical truths could be formally 
proved from arithmetical axioms, so that the conditionals linking the two 
could be formally proved from logical axioms alone. Hence If-thenism, 
while it was not refuted by the earlier discovery of the paradoxes, seems 
to have been refuted at about the time it was proposed by Gödel’s first
incompleteness theorem. Has the If-thenist any effective reply to this 
criticism? 

He might first try to stick to his guns, bringing Gödel’s completeness
theorem for first-order logic to his aid. This theorem shows that everything 
that follows from or is a logical consequence of any first-order axiomatic 
system can be formally proved from those axioms. If we identify logic 
with first-order logic, and mathematics with the collection of first-order 
theories, then we can continue to maintain the If-thenist position. A 
mathematical statement becomes, via claim (F), a conditional statement 
with a conjunction of first-order mathematical axioms as antecedent and 
a first-order mathematical theorem as consequent. And all the logically 
true conditionals of this sort will be formally provable from logical axioms 
alone. Mathematics, in so far as it is adequately formalised in first-order 
logic, can be identified with logic after all.

It will be objected, however, that Gödel’s incompleteness theorems show 
precisely that first-order logic is not adequate for arithmetic (or for any 
mathematical theory strong enough to contain arithmetic). For consider 
the conditional statements ‘P ⊃ G’ and ‘P ⊃ ¬G’, where ‘P’ stands for 
the first-order axioms of arithmetic and ‘G’ for an undecidable Gödelian 
sentence. Or consider the conditional statement ‘P ⊃ C’, where ‘C’ is 
the arithmetical statement expressing the consistency of first-order 
arithmetic. None of these statements is first-order provable, and none of 
them is a first-order logical truth. Hence there are interpretations of the 
first-order axioms in which they are all true and G false (as well as in- 
terpretations in which all axioms are true and G true). And hence there are 
non-standard models of first-order arithmetic in which an ‘arithmetical 
truth’ (namely G) is false, so that these axioms are inadequate. 
Mathematicians think that the conditional statements ‘P ⊃ G’ and ‘P ⊃ C’ are 
true, because they think that their consequents are true. Hence there are 
conditional mathematical assertions which are not first-order logical 
truths. Thus ‘first-order If-thenism’ (as we might call it) collapses. 

A second possible way out for the If-thenist is to renounce the claim 
that all the true conditionals of mathematics are formally provable, while 
continuing to maintain that they are logical truths.1 This is to renounce 
the claim that first-order logic exhausts logic (since by the completeness 
theorem all first-order logical truths are provable). It enables the If- 
thenist to continue to maintain that all mathematical statements are 
conditional in form, and that the true ones are logically true. In particular, 
the conditional statements ‘P* ⊃ G’ and ‘P* ⊃ C’ (where ‘P*’ denotes 
the second-order axioms for arithmetic) continue to be logical truths. All 
that Gödel’s incompleteness theorem shows, on this view, is that they 
are not logical truths which are provable from the axioms of second-order 
logic. It shows, in effect, that there are no axioms for second-order logic 
from which all the logical truths of second-order logic are provable. 
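
The step from incompleteness to the unaxiomatisability of second-order logic can be sketched as follows. The notation is an assumption made for illustration and is not the author's: \vDash_2 stands for validity in full second-order semantics, \mathbb{N} for the standard model, D for any sound and effectively axiomatised deductive system for second-order logic, and G_D for a Gödel sentence constructed for P* relative to D.

```latex
% Categoricity (Dedekind): every full model of P^{*} is isomorphic to \mathbb{N},
% so each arithmetical truth A is a second-order consequence of P^{*}.
\mathbb{N} \vDash A \;\Longrightarrow\; \vDash_2 P^{*} \supset A

% Incompleteness: for each such system D there is a true sentence G_D
% that D cannot derive from P^{*}.
\vDash_2 P^{*} \supset G_D
\qquad\text{and}\qquad
\nvdash_D \; P^{*} \supset G_D
```

So the conditional is a second-order logical truth that no single system D proves, which is the situation the post-Gödelian If-thenist has to accept.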

This ‘post-Gödelian If-thenism’ (as we might call it) is a far cry from 
the original, whose central thesis was precisely the formal provability of 
all conditional mathematical truths. The post-Gödelian If-thenist could 


1 If I understand him rightly, this is the position Putnam defends in his [1967b] and his 
[1971]. In the former Putnam contrasts two “equivalent descriptions” of mathematics, 
the first the familiar “Mathematics as Set Theory”, the second the unfamiliar 
“Mathematics as Modal Logic”. Concerning the latter, Putnam claims that the mathematical 
content of a proof that Fermat’s last theorem is false is expressible by a conditional 
scheme of modal logic of the form “Necessarily (A ⊃ ¬F)”. But the reference to modal 
logic here seems to be a red herring, since Putnam says that he is using necessity as 
being equivalent to logical validity (Putnam [1967b], pp. 9-11). I should also mention 
here Putnam’s earlier [1967a], in which he coined the term ‘If-thenism’. Incidentally, 
Putnam claims there that Russell subscribed to ‘If-thenism’ before he subscribed to 
logicism proper; as I see it, Russell never clearly distinguished between the two at all 
(see Putnam [1967a], p. 281). 
still insist, however, that any concrete mathematical result is not only a 
logically true conditional, but a provable one. In particular, although 
Gödel showed that the consistency of arithmetic is not provable by methods 
formalisable within arithmetic itself (so that ‘P* ⊃ C’ is not a provable 
logical truth), Gentzen has proved the consistency of arithmetic from 
assumptions which transcend arithmetic. This result (like all others) is 
a provable logical truth ‘P** ⊃ C’, where ‘P**’ denotes the assumptions 
of Gentzen’s proof. An objection to this is that mathematicians believe 
C to be true unconditionally, and not because it is provable from P** 
(which contains assumptions more dubious than C itself).1 But any correct 
proof proceeds from assumptions which ought to be more dubious than 
the conclusion, since the conclusion is contained in them. The point of 
view which underlies this objection would, therefore, if generalised, 
undermine the whole notion of ‘mathematical proof’, and the axiomatic method 
with it. An If-thenist might well retort that he is interested in the status 
of mathematical results, and not in the strengths of the beliefs of mathe- 
maticians. And he might well reaffirm his conviction that all mathematical 
truths are logically true conditional statements. 

This version of If-thenism stands or falls, it seems to me, on whether 
we are prepared to extend the title ‘logic’ to higher-order logic, and to 
countenance thereby logically true conditionals which are not provable. 
If we are, then If-thenism provides a way to assimilate mathematical 
truth to logical truth, without making the implausible claim that the various 
existential axioms of set theory or the various geometrical axioms are 
logical truths. But if we are not, if we are persuaded, for example, by the 
argument given on page 104, then we will have to admit that the question 
of the epistemological status of mathematics remains open. 
Semantical Questions in Carnap’s 
Inductive Logic 


by HOWARD SMOKLER 


Rudolf Carnap’s last writings on the theory of inductive logic are now in 
the process of publication. Volume I of Studies in Inductive Logic and 
Probability was issued over four years ago; Volume II is close to being 
published. It is time to consider once again (as Wesley C. Salmon did in a 
series of articles in the 1960s) some of the issues raised by this protean 
philosopher of induction. One could argue that consideration should await 
the publication of Volume II. But I think there are enough themes of 
importance in Volume I to warrant inspection and criticism. 

Much of the work in Volume I is a recapitulation of themes which were 
present in embryo in the Logical Foundations of Probability and which were 
developed by Carnap in the post-Foundations period from 1950 until his 
death. Most of these results were published in a number of papers. In 
addition to these papers, Carnap provided to a circle of students and 
associates a great deal of mimeographed materials. The article, ‘A Basic 
System of Inductive Logic, Part I’, is an editing of part of this material. 
Part II of the ‘Basic System’ will be published in Volume II. The latter 
contains material on analogical inference which, in my opinion, represents 
a new departure in Carnap’s thinking. 

Salmon has investigated some of the developments in Carnap’s inductive 
logic, developments which had already become apparent by the middle 
sixties. But there are themes in Carnap’s work to which Salmon did not 
pay enough attention. 

I will not presume to offer an extended critique of Carnap’s last system. 
Instead, I shall take my text from a remark which Salmon makes in one 
of his papers: 

Carnap’s chief contribution consists, I believe, in bringing the precision of 
formal logic and semantics to the field of inductive logic.1 


I shall, in this paper, concentrate on a discussion of Carnap’s last work; in 
particular, I will discuss the interaction of semantical doctrine with 
inductive logic. 

There are two themes which run through Carnap’s post-Logical 


Received 15 January 1976 
1 Salmon [1967], p. 726. 
Foundations work and which are articulated by him in Studies. Together 
they have a bearing on his semantical doctrine. 

(a) Inductive logic is a discipline which arises in the context of human 
action. Superimposed upon a descriptive theory of decision-making is a 
normative theory of the same domain. The normative theory specifies the 
conditions for rational action. As belief is a necessary component of 
decision-making, so rational belief is a necessary condition of rational 
decision-making. Certain conditions placed upon bets (acts) insure that 
the odds given on the occurrence of an event (or set of events) provide a 
probability measure on the set of events. Further conditions placed upon 
bets insure that the probability measure never assigns the value 1 or 0 to 
the probability of an event except if the event is the necessary or impossible 
event. Carnap calls the measure of rational belief for a person X at a time 
T, that person’s credence function Cr_{X,T}(H), for short Cr_X(H). He argues 
further that rational change in belief over time should be a function only 
of the credence function at the beginning of that time and of the evidence 
that might be accumulated during the period after the initial assignment of 
belief. In other words, he postulates a permanent disposition to believe, 
which he calls a credibility function Cred (H). For this function (but not for 
the credence function) Carnap postulated axioms of symmetry. These state 
that the credibility of two propositions, H and H’, are equal if the only 
difference between them is that the individual in H is replaced by another 
individual. 
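
The conditions listed in (a) can be summarised schematically. This is only a sketch under stated assumptions: the symbols \Omega (the necessary event), \emptyset (the impossible event), E_T (the evidence accumulated up to time T), the conditional form \mathrm{Cred}(H \mid E), and \pi (a permutation of the individuals) are notation introduced here for illustration and should not be read as Carnap's own formulation.

```latex
% Coherence: the betting conditions make Cr a probability measure
% (finite additivity is the minimum assumed here).
\mathrm{Cr}(\Omega) = 1, \qquad \mathrm{Cr}(H) \ge 0, \qquad
\mathrm{Cr}(H_1 \cup H_2) = \mathrm{Cr}(H_1) + \mathrm{Cr}(H_2)
\quad \text{if } H_1 \cap H_2 = \emptyset

% Strict coherence: extreme values only for the necessary or impossible event.
\mathrm{Cr}(H) = 1 \;\Rightarrow\; H = \Omega, \qquad
\mathrm{Cr}(H) = 0 \;\Rightarrow\; H = \emptyset

% Credence at a time as a fixed credibility function applied to the evidence.
\mathrm{Cr}_{T}(H) = \mathrm{Cred}(H \mid E_T)

% Symmetry: credibility is invariant under exchanging individuals.
\mathrm{Cred}(\pi H) = \mathrm{Cred}(H)
```

The third line is one way of expressing the idea that rational change of belief depends only on an initial disposition and the evidence acquired since.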

These axioms or principles of rational belief are stated in quasi- 
psychological terms. An inductive logic is a set of axioms corresponding 
to the axioms or principles of rational belief but stated in a purely logical 
(set-theoretic) way. However, inductive logic is not logical in the sense 
that its set of axioms are derivable by purely logical principles from some 
set of commonly accepted logical axioms. There is a difficulty here. Any 
axiomatized discipline stated in set-theoretic terms, for example, physics, 
is logical in the same sense as is inductive logic. 

However, one thing is clear. Belief contexts are intensional; contexts 
within the scope of a belief operator contain referential and extensional 
positions which are opaque. As a consequence, the language in which belief 
relationships are expressed is intensional, as is by extension the language 
in which probability relationships are expressed. But as Quine’s discussion 
in “Three Grades of Modal Involvement’ shows, intensional languages 
which have no place for quantification into the intensional contexts can be 
given meta-linguistic paraphrases in which the quoted portion of the 
expression is formulated in an object language which is extensional. 
Carnap’s strategy is the same as Quine’s and rests upon the assumption 
that there are no significant uses of the probability concept which involve 
quantification into probability contexts. 

(b) Carnap formulated the theory of inductive probability in such a 
manner as to make it correspond to what he considered to be mathematical 
and statistical practice. Instead of taking as arguments sentences expressing 
propositions, he takes the propositions themselves. 


Instead of saying that any particular system of inductive logic applies to a 
certain language, we might just as well say that it applies to a certain conceptual 
system, namely a universe of objects and a system of descriptive concepts that 
characterize the objects; and further a class of propositions of various forms, 
involving these individuals.1 

These propositions are ‘represented in an extensional way’ (Carnap and 
Jeffrey [1971], p. 35) or, in other words, as extensional entities. This 
representation is at variance with Frege’s semantical doctrine of treating 
propositions as the intensions of certain terms, but Carnap believes that 
he has good reason for such an identification. 

Furthermore, Carnap identifies propositions with events. Carnap of 
course is correct in stating that the subject-matter of the mathematical 
theory of probability is usually taken to be events, although as that doctrine 
is formulated by W. Feller, they are taken as the subject-matter only of the 
statistical theory of probability. Feller says, 

In a rough way, we may characterize this concept [statistical probability] by 
saying that our probabilities do not refer to judgments but to possible outcomes 
of a conceptual experiment (p. 4, 2nd ed.). We have to agree on what we mean 
by possible results of the experiments or observations in question (p. 7). For 
uniform terminology, the results of experiments or observations will be called 
events (p. 8). If we want to speak about ‘experiments’ or ‘observations’ in a 
theoretical way and without ambiguity, we must first agree on the simple events 
representing the thinkable outcomes; they define the idealized experiment.2 


Feller goes on to say that judgments are the subject-matter of inductive 
probabilities, and that these probabilities are not suitable for mathematical 
treatment. Formally speaking, events are sets. 

Except for this last proposition, Carnap disagrees with Feller. His own 
view is that events are the subject-matter not only of statistical probabilities 
but also of inductive probabilities. For propositions, not judgments, are the 
subject-matter of inductive probabilities. However, corresponding to each 
sentence in a language (judgment) is a proposition. Events are identified 
with propositions and so the subject-matter of inductive and statistical 
probabilities is the same entities. If this identification is made, then 
Carnap has achieved a unification of the subject-matter of probability. 


1 Carnap and Jeffrey [1971], p. 47. 2 Feller [1957], p. 9. 
Carnap has several motivations for taking propositions as the arguments 
of probability functions. He says explicitly that he wishes to ‘formulate 
inductive logic in terms that had come to be standard in mathematical 
probability theory and theoretical statistics where probabilities are 
attributed to “events” (or “propositions”) ...’. Furthermore, such a 
construction overcomes the serious problem of language-dependence 
present in The Logical Foundations of Probability. (However, Salmon has 
shown that there are analogous problems of variance at the objectual level.) 
And there is a technical reason for the selection of propositions: for in the 
case of complex conceptual systems, e.g. those involving real-valued 
functions, no language can express all possible propositions (or events). 

Obviously a language is required in which to represent propositions. 
That language is, for Carnap, a non-modal extensional one, essentially that 
of set theory. Since propositions are the meanings of sentences, the set of 
propositions represented in this set-theoretical language can also be 
characterised by reference to a logical language. In the case of the ‘Basic 
System’, the language ℒ is a first-order one with identity, a countable 
number of individual constants, and a countable number of monadic 
predicate-terms, organised in a finite set of family terms. These linguistic 
categories have a correspondence with the elements in a conceptual 
system which was mentioned earlier in this paper. 

ℒ packages the information carried by the set-theoretic language, but 
it is in a different idiom. In fact, this idiom reveals the complexity of the 
information allowed in an inductive system more perspicaciously than does 
the set-theoretic language. So we shall speak henceforth of the language 
ℒ and its semantic characteristics. 

Let us for a moment consider some of the characteristics of ℒ: 

(1) ℒ is an extensional language and it is non-modal. However, if 𝒞 
were added to the vocabulary of ℒ, the language would be intensional, 
since probability contexts are intensional. However, since Carnap in effect 
treats this intensionality as second-grade modal involvement, he treats 𝒞 
as a meta-linguistic term. 

(2) Corresponding to each sentence of ℒ is a proposition, but the 
converse is not true. This fact establishes a relationship between the first 
order and set-theoretic languages to which I made reference earlier in 
this paper. 

(3) All individual constants designate distinct objects, i.e. if a_i and a_j are 
two individual constants in the vocabulary of ℒ, a_i ≠ a_j is L-true. So 
there are no true identity statements in the language other than of the 
form a_i = a_i, and these statements presumably are L-true. 

(4) The acceptance of the axiom of symmetry robs the individual 
constants of any informational value. To quote Paul Teller’s review of 
Studies on this point: 

In short the axiom of symmetry can be secured only by building into it the 
interpretation of the constants. The trouble is that this interpretation [of the 
constants as mere labels] renders the constants of ℒ functionless.1 


In effect no singular descriptions are permitted in the language.2 

(5) To each kind of expression in ℒ an intension and an extension are 
assignable (following Carnap’s semantic doctrine in Meaning and Necessity). 


                                        ℒ 

                            Extension                   Intension 
Sentence (S)                Truth-value of S [TV (S)]   Proposition (that S) 
Predicate-term (F%)         Model (2*)                  Interpretation of FR 
Individual constant (a_i)   Individuals                 ? 

What Carnap now does, for the purposes of inductive logic, is to identify 
the intension of a sentence S (that S) with the truth-set of S: 


When we speak about 𝒞(H/E), instead of saying ‘the truth set E of the 
evidence proposition’ we shall simply say ‘the evidence proposition E’. Thus 
we identify propositions with truth-sets.3 

Having outlined some of the relevant semantic doctrine of Carnap, I 
want to proceed to some critical remarks about it. 

(1) Carnap identifies a proposition with the truth-set of the sentence 
which corresponds to the proposition. In my view, this identification is 
implausible, however convenient. In the following discussion I present 
arguments against this identification, at least in the context of inductive 
logic. 

Carnap identifies propositions with truth-sets of sentences. This 
identification is plausible if one assumes that the language in which the 


1 Teller [1974], p. 23. 
2 It is perfectly true that given the resources of ℒ, definite descriptions can be defined in 
the Russellian way. However, they would seem to be functionless on two counts: 
(1) The axioms of symmetry presuppose the understanding of all singular descriptive 
terms as mere labels. 
(2) If definite descriptions were permitted in the language while the condition for 
identity statements noted in (3), p. 132, above were to hold for those terms, then all 
singular identity statements of the form 


(ιx)Fx = (ιx)Gx 


would be L-false. 
3 Carnap and Jeffrey [1971], p. 58. 
propositions are expressed fulfils a version of the equivalence condition 
which states that all logically equivalent sentences of the language have the 
same meaning (express the same proposition). Under these conditions two 
logically equivalent sentences must have the same truth-sets. But 
probability languages are languages which incorporate a belief operator; and 
it is a well-known fact (accepted by Carnap in his comments on semantical 
theory in his ‘Replies and Systematic Exposition’)1 that for languages of 
this kind, called by Carnap languages of sense, the equivalence condition 
does not hold. He masks the situation in this case by treating 𝒞 as a 
meta-linguistic form and then claiming truthfully that ℒ is extensional.2 But a 
more thorough representation of the semantic situation would reveal this 
dodge. For this reason I think that the identification of propositions with 
truth-sets of sentences is not warranted. 

Another serious problem remains in Carnap’s ultimate system. The 
language contains no singular descriptive terms. Indeed, names have 
no function and sentences containing them should be replaced by 
existentially quantified ones. No sentence of the form 

a_i = a_j 
where a_i and a_j refer to the same individual is true or L-true in the 
language; in fact, the sentence is not well-formed. But the contexts for 
which Carnap has designed his inductive logic—belief-contexts—are 
replete with such uses of definite descriptions. Take Quine’s well-known 
example: 

(A) Tom believes Wyman is on the beach 

(B) Tom does not believe the mayor of the town is on the beach 

although 
Wyman = the mayor of the town 
Tom does not believe Wyman = the mayor of the town. 


If definite descriptions were available, we would translate (A) and (B) as 


(A’) Cr_Tom (Wyman is on the beach/E) ≥ 1/2 
(B’) Cr_Tom (the mayor of the town is on the beach/E) ≤ 1/2 


but this credence function cannot be represented in ℒ. Pragmatically, even 
the simple language employed by Carnap to represent the belief situation 
is woefully inadequate if it does not allow the introduction of the 
relevant information carried in the singular terms. 
Carnap has carefully constructed his semantical system so that the 


1 Schilpp [1963], pp. 889-905. 

2 In addition, Carnap wished to avoid by this stance the possibility of iterated probability 
statements. It is not as clear to the present generation of logicians as it was to Carnap 
that such iterated probabilities must be avoided. 
problems of substitutivity of identity and of quantification into 
probabilistic contexts do not arise in what is clearly an intensional language. 
In Carnap’s language the vocabulary is so restricted that the intensionality 
of the language is of the second grade of modal involvement. As modal 
logicians know, the introduction of definite descriptions into an intensional 
language prima facie makes that language one in which there is a third 
degree of modal involvement (modality de re). The rules for identity and 
constants which Carnap adopts, artificial as they are, avoid this difficulty. 
But a serious problem is implicit in his system—a collapse of modal 
distinctions similar to the one Quine has shown holds for necessity 
contexts and Føllesdal for causal contexts. 


University of Colorado 
2 THE PROBLEM OF INFINITY 


2.1 The problem of infinity in classical mathematics. 


‘The absolute’, wrote Cantor, ‘can only be acknowledged, never apprehended, 
not even by approximation.’ ‘It is’, he maintained, ‘unincreasable and, 
therefore mathematically indeterminable.’ As an ‘absolute quantitative 
maximum’ it ‘exceeds all human capacity for understanding, and, in 
particular, eludes mathematical determination.’ 

In these passages, Cantor is attempting to describe what he called the 
absolute or the genuine infinite. He believed—rightly—that this notion 
was of central importance in the foundations of his mathematical theory 
of transfinite numbers. He held that the distinction between the mere 
transfinite, which could always be increased, and the unincreasable 
absolute, was a fundamental one. For him, it separated the legitimate 
province of mathematics from that of theology: the domain in which 
human reason is competent from that accessible only to the Divine. 

Whether or not one can accept Cantor’s theological explanation of this 
distinction, the distinction itself is valid, and the problem of the infinite— 
the absolute or genuine infinite—remains. It is a problem with which any 
theory of the foundations of mathematics must deal. Frege’s failure to 
realise this was the essential cause of the disaster which befell his system. 
I must emphasise that the problem of infinity is not just the problem 
of how to treat the natural numbers. It is even more fundamental than 
that. It must be faced no matter how the natural numbers are to be 
regarded. The problem of infinity is the problem of how to deal with 
the totality of mathematical objects, the problem of the global structure of 
mathematics. 

Let us consider the problem as it arises in classical mathematics. There 
it most definitely does not concern the status of the set of natural numbers. 
On the contrary, since Cantor it has been a central tenet of classical 
mathematics that the set of natural numbers is not infinite, at least not in 
the etymologically strict sense of limitless or boundless. To be sure it sounds 
paradoxical, even contradictory, to say this. But that is only because the 
terms finite and infinite have been pre-empted by mathematicians and 
given special, technical meanings which have no direct or necessary 
connection with their proper meanings. There is no immediate reason to 
suppose, for example, that a set which is technically infinite in the sense 
of being in one-to-one correspondence with one of its proper subsets, 
is infinite in the primary sense of being ‘subject to no limitation or external 
determination’.1 In classical set theory, every set is subject to ‘external 
determination’ in the sense of being embeddable in sets of arbitrarily 
large cardinality, and every number, whether cardinal or ordinal, is 
‘limited’ by still larger numbers of the same sort. The technical use of 
the terms finite and infinite is, indeed, especially unfortunate. Like the 
earlier use of real and imaginary it begs, at least by implication, an im- 
portant foundational question. Of course, on the traditional, pre-Cantorian 
view, any set infinite in the technical sense is also infinite in the proper 
sense. But then modern mathematics rejects this traditional view. 
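
The technical notions invoked in this paragraph can be written out as follows. This is a sketch in standard set-theoretic notation that does not appear in the text: \mathrm{Inf} is a label chosen here for Dedekind's criterion, \mathcal{P}(x) is the power set of x, and |x| is its cardinality.

```latex
% Technically infinite (Dedekind): x is in one-to-one correspondence
% with one of its proper subsets.
\mathrm{Inf}(x) \iff \exists y \, \exists f \,
  \bigl( y \subsetneq x \;\wedge\; f : x \to y \text{ is a bijection} \bigr)

% Every set is 'limited' from outside (Cantor's theorem): each set,
% technically infinite or not, embeds in a set of strictly larger cardinality.
\forall x \;\; |x| < |\mathcal{P}(x)|
```

The second line holds for every set alike, which is the sense in which no set is infinite in the proper sense of being subject to no limitation.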

In fact, if we look beyond the mere technical terminology into the 
fundamental presuppositions and attitudes underlying modern mathe- 
matics, we shall find that it is based on the idea that all sets are finite, in 
the proper sense. In practical terms this means that all of the basic 
set-theoretical operations apply indifferently to all sets, whether they are 
finite in the merely technical sense or not. Set theory is thus an egalitarian 
theory. No fundamental theoretical distinction is made between sets 
which are technically infinite and those which are not. This is not to say 
that the technical distinction between finite and infinite is unimportant: 
only that each of the basic operations of set theory—and, in particular, 
the key operation of power set formation—is logically prior to this distinction. 


1 This is the principal meaning of the term ‘infinite’ given in Webster's Third New Inter- 
national Dictionary. The same source gives the primary meaning of ‘finite’ as ‘having 
definite or definable limits’ or ‘having a limited nature or existence’. 
Indeed, the usual technical distinction between finite and infinite sets is 
possible, in general, only by virtue of the universal applicability of the 
power set operation. For to say of a set that it is finite in the technical sense, 
one has to quantify over its power set. 

It is this deep-seated egalitarianism in modern mathematics that is 
fundamentally at odds with the formal treatment of set theory which 
arises out of the conventional view of logic. In particular, this is why the 
notion of the proper class is inherently unstable, giving rise to a more 
commodious notion of ‘superclass’, closed under the same principles of 
generation as the ‘ordinary’ sets. This results directly from the obvious 
violation of egalitarian principles which is involved in postulating a kind 
of collection—the proper class—to which one cannot do the sort of things 
one can do to ordinary collections. 

Viewed in another way, this egalitarian outlook reflects a robust con- 
fidence in the soundness of the intuitions upon which set theory and the 
axiomatic method are based. One often sees it accompanied by a kind 
of philistine impatience with foundational questions. This can sometimes 
give rise to absurdities like the surprisingly widespread belief that an 
axiomatic definition of the concept of set is logically possible. 

But what is the foundational significance of this egalitarianism? And 
how does it relate to the problem of the absolute, of the true infinite, 
raised by Cantor? The answer is implicit in what I have already said. 
Modern classical mathematics, the mathematics of the axiomatic method, 
the mathematical structuralism championed by Bourbaki, is really a 
form of finitism. It ‘solves’ the problem of infinity by simply ignoring it. 
It remains resolutely and incorrigibly local in outlook, even when, as in 
category theory, it pretends to adopt a global point of view. This localness 
of outlook is reflected in the egalitarianism already mentioned. And, in 
general, all definitions, arguments, constructions, etc., can be construed 
as taking place inside some suitably chosen set, which constitutes a local 
universe of discourse. On those rare occasions when a genuine global issue 
does arise—as with the question ‘Does there exist a measurable cardinal?’ 
for example—everyone feels instinctively ill at ease. And there is a visceral 
conviction that somehow such questions may not be well posed, in the 
sense of admitting of a definite answer one way or the other. Formally, 
this localness of outlook is reflected in the fact that we can get by with 
quantifiers whose domains of variation are sets, and operations of higher 
type which are continuous in their functional arguments (like the replace- 
ment and recursion operators—see section 2.2). 

The idea of treating the natural numbers as a ‘closed’ or ‘completed’ 
(i.e. finite) totality was regarded at first as a revolutionary one. And so 
it was. But it now serves impeccably conservative ends. For once this 
admittedly very strong assumption is made, once there is available a 
definite structure satisfying the Dedekind—Peano axioms for the natural 
numbers, then every other structure that mathematicians are interested 
in—the real numbers, in particular—can be constructed by straightforward 
set theoretical techniques. One is therefore relieved of the effort 
of having, as it were, to finitise all of one’s structures. It can be done 
cleanly, neatly, once and for all, with the natural numbers. Thus this 
single decision to treat the natural numbers as forming a set, banishes 
all the disturbing foundational difficulties associated with the problem of 
infinity to a distance infinitely (in fact, absolutely infinitely) remote from 
the practical concerns of the vast majority of mathematicians. 

The great simplicity of outlook to which this finitism gives rise no 
doubt explains why modern classical mathematics is so much more popular 
than any of its rivals. The classical mathematician has the immense 
advantage that he does not need constantly to concern himself with 
foundational questions merely to do, say, ordinary real analysis or even 
arithmetic. In particular, he can (unlike an intuitionist, for example) 
simply ignore the problem of infinity—with all its attendant logical, 
ontological, epistemological, and (as anyone familiar with the literature 
of intuitionistic analysis will appreciate) terminological difficulties—and 
get on with the mathematical job at hand. And he can do this, because 
he will never have to leave, what is for classical set theory at any rate, 
the realm of the finite. 

The finitism of classical mathematics is, of course, not traditional 
Hilbertian finitism. I propose to call it Cantorian finitism, since Cantor 
was the first to articulate it as a clear point of view. He deserves con- 
siderable credit for having recognised the problem of infinity, and for 
having seen so clearly how his revolutionary ideas could be accommodated 
to the restraints imposed by that problem. This is an aspect of his genius 
and achievement which has not been adequately recognised. 

Cantor had substantially worked out his position on the problem of 
the absolute in the 1880s, early in his development of the theory of 
transfinite numbers. This was why he was able to regard with equanimity 
the controversy over the foundations of set theory occasioned by the 
discovery of the ‘paradoxes’ near the turn of the century. Here, for 
example, is a passage taken from a letter from Cantor to A. Eulenberg, 
in which he attempts to explain his doctrine of infinity. The letter is dated 
28 February 1886: 


The transfinite, with its abundance of forms and configurations, points the 
way, of necessity, to an absolute, to the ‘genuine infinite’ on whose magnitude 
there can be no increase or decrease of any sort, and which is therefore to be 
regarded as a quantitative absolute maximum. The latter exceeds all human 
capacity for understanding, and, in particular, eludes mathematical deter- 
mination. ([1966], p. 405) 

Cantor is talking here, not about the theory of sets, but about his own 
theory of transfinite numbers. (These were to be obtained from sets by 
abstraction.) It is clear that he was aware that there are serious difficulties 
concerning the notion of the greatest cardinal (or ordinal) number, though 
it is doubtful that he realised, this early, that these difficulties penetrated 
to the level of sets. (It was only much later that he was to speak of 
‘inconsistent totalities’.) His solution to the problem of infinity is suggested 
by the passage I have just quoted: in mathematics one sticks to the realm 
of the transfinite. One does not attempt to scale the ‘genuine infinite’ 
which exceeds the power of mere human reason to comprehend it. 

Cantor’s solution to the problem of infinity works. It has become the 
current orthodoxy, if only by default. The localness of outlook which is 
the essential core of Cantorian finitism arises quite naturally out of 
simple assumptions concerning the homogeneity of the realm of sets, 
and the universal applicability of set theoretical constructions. And to 
produce their effect, these assumptions need not be explicit at all. Thus 
the mathematician to whom the idea does not even occur that his methods 
might have only a restricted application, or that there might be mathe- 
matical structures to which ordinary mathematical procedures do not 
apply, must be classified as a Cantorian finitist whatever his professed 
opinions may be. 

So far I have done little more than simply to call attention to the 
Cantorian finitist point of view. I have not given any details concerning 
its mathematical formulation. To solve the technical problems posed by 
Cantor’s general outlook—and, in particular, by his doctrine of the absolute 
infinity of the realm of sets—I shall have to introduce some important 
new notions into classical set theory (although, I maintain, they have 
really been there all the time, in one form or another). In particular, I 
shall show how Frege’s notion of function together with Brouwer’s idea 
of continuity can be used to clarify Cantor’s ideas. 


2.2 Functionals and continuity in classical set theory 

Classical set theory is a general system of mathematical logic. It en- 
compasses the entire territory in which classical predicate logic and classical 
methods of set construction hold sway. I have already given a brief 
description of this theory in section 1.7, and a thorough treatment of 
it in my article [1977a] (also see the Appendix). But in both of these 
accounts I have concentrated entirely on logical issues. My main 
aim was to sort out the connections between classical predicate logic 
and general set theory, and this involved me in the primary task of 
reformulating the semantics of classical logic. Now I want to take a 
different approach. I want to look at classical set theory as the embodiment 
of the Cantorian-finitist point of view. I want, therefore, to consider it 
in the light of what I have called the problem of infinity. 

This change in perspective necessarily leads to a somewhat different 
view of the basic principles of the theory. In particular, it becomes more 
natural to consider the principles of comprehension, replacement, and 
quantification in conjunction with the principle of definition by transfinite 
recursion. This would not be appropriate in a logic-oriented approach, 
since unlike the other three principles, the principle of definition by 
transfinite recursion goes beyond elementary or first order methods. But 
all of these principles involve the key idea of continuity. And it is this 
idea that lies at the heart of the Cantorian-finitist solution to the problem 
of infinity. 

Let’s look at classical set theory from this new point of view, then. 
I have clearly shown how simple and practical considerations of logic 
force us to accept the analysis of the concept of set given by Zermelo and 
Mirmanoff. Their analysis yields the most general notion of extensional 
collection conceivable. Sets must therefore be regarded as occupying the 
successive levels of a cumulative hierarchy of levels. The sets occupying 
a given level are those whose members first occur at lower levels. In 
particular, at the lowest level occur all those objects which have no 
members. These comprise the empty set together with all objects which 
are not sets. The latter are traditionally called urelements. It is not the 
business of general set theory to say anything whatsoever about them. 
We must assume, of course, that they are well-defined and properly 
individuated, so that the usual classical laws of identity hold true of them. 
We should not, however, assume that these urelements can all be gathered 
together to form a single, well defined set. And since the levels of the 
hierarchy are cumulative, this means that we should not assume that the 
objects occurring on any given level can be formed into such a set. 

Sets and urelements together comprise the ontological category of 
objects. The other basic category of the theory is that of functionals. 
Functionals are to be sharply distinguished from objects. They are not 
members of the cumulative hierarchy, but maps defined at each object 
of the hierarchy having objects of the hierarchy as values. (Functionals 
of two or more arguments are also possible, of course.) 

What I am calling ‘functionals’ here are what Frege called ‘functions’. 
I have adopted a different terminology in deference to current usage. For 
what are called functions nowadays are entities of a sort Frege would 
have called objects. They correspond more to his value-ranges (Werth- 
verläufe).¹ 

I shall reserve the term function for locally defined maps, considered 
as sets of ordered pairs. Functions are thus identified, in my terminology, 
with their graphs. They are therefore objects—members of the cumulative 
hierarchy. It follows that they have clear-cut, extensional identity con- 
ditions. They can differ only by having different domains of definition, 
or by taking different values for at least one argument in a common 
domain of definition. Functionals, in strong contrast to this, have no 
identity conditions whatsoever—clear-cut or otherwise. There is simply 
no way of asserting the identity of functionals in classical set theory. 
(This is an important point to which I shall return later.) 
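
The contrast can be made concrete in a finite toy model. The following Python sketch is only an illustrative analogy and not part of the theory itself: the names graph, double, add_to_itself and square are hypothetical, Python callables stand in for functionals, and finite sets of ordered pairs stand in for functions. A graph can be compared with another graph pair by pair, whereas two rules can only be compared through their graphs on some given set.

```python
# Toy analogy (an assumption of this sketch, not the author's formalism):
# "functions" are finite graphs, "functionals" are rules given as callables.

def graph(functional, domain):
    """Restrict a rule to a set, giving a function in the text's sense:
    a set of ordered pairs with extensional identity conditions."""
    return frozenset((x, functional(x)) for x in domain)

def double(x):
    return 2 * x

def add_to_itself(x):
    return x + x

def square(x):
    return x * x

small = frozenset({0, 2})
wider = frozenset({0, 1, 2, 3})

# Graphs differ only by domain or by a value at some common argument.
same_on_small = graph(double, small) == graph(square, small)
same_on_wider = graph(double, wider) == graph(square, wider)

# Two differently stated rules can only be compared via graphs on a given set.
rules_agree = graph(double, wider) == graph(add_to_itself, wider)
```

Here double and square yield the same function on the set small but different functions on wider, so any comparison of the rules themselves is always relative to the set on which their graphs are taken.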

It is obviously a fundamental task of the theory to explain why, and in 
what sense, functionals differ from objects. In particular, their exclusion 
from the ranks of the urelements must be accounted for. As we shall see, 
the attempt to carry out this task will lead us directly to confront the 
problem of infinity in classical mathematics, and to formulate, more 
precisely than before, the Cantorian-finitist solution to that problem. 

The central importance of the concept of functional has hitherto been 
obscured by the conventional formalisation of set theory as a traditional 
first order theory. In that formalisation, the role properly assigned to 
functionals has been usurped by proper classes. Indeed, it is partly in 
order to fill the gap left by the failure to include functionals among the 
primitive apparatus of the theory, that proper classes must be introduced. 
But since proper classes are nothing but overblown sets, they are entirely 
unsuited for this role. Here we are back to the point I made in section 1.2. 
There is no way out of the cumulative hierarchy—the realm of objects—by 
simply making our collections larger and larger. 

For Frege, functionals differ essentially from objects by virtue of being 
‘incomplete’ or ‘unsaturated’. This is the feature of functionals cor- 
responding to the circumstance that functional signs require completion 
by the insertion of names for objects in their argument places before 
they can occur as parts of meaningful terms or sentences. It seems that 
Frege himself actually arrived at the idea of a functional by thinking 
about the way in which functional signs are customarily used. 


1 Recall that for Frege the value-range of a functional f is something like the set of all 
ordered pairs (x,y) such that f(x) = y. His value-ranges are therefore objects. The purpose 
of a value-range is to condense into a single object all the information concerning the 
values of the corresponding functional for all arguments. 

The sign for a functional is unsaturated; it needs to be completed with a numeral 
which we then call the argument sign. . . . The peculiarity of functional signs, 
which we have called ‘unsaturatedness’, naturally has something answering to 
it in the functionals themselves ([1904]). 


It would be difficult to exaggerate the importance of the distinction 
between functionals and objects for Frege’s general philosophical position. 
What I want to do here, is to explore some of the consequences of this 
distinction for classical set theory, and, in particular, to establish its 
connection with the Cantorian problem of the absolute infinity of the 
universe of sets, i.e. in Fregean terms, of the absolute infinity of the realm 
of objects. 

Let me begin with a question. Is the Fregean notion of functional an 
extensional one? It is certainly true that Frege himself had an extensional 
view of functionals in the sense that he did not distinguish functionals 
having the same values for the same arguments. But this is by no means 
the whole story. For we must not forget that Frege was either unaware 
of, or simply ignored, what I have called the problem of infinity. And this, 
as we shall see, has a direct, even a decisive, bearing on the question at 
issue. 

Frege’s extensional view of functionals is forced upon him by two of 
his central ideas: his theory of value-ranges and his theory of quantification. 
Consider the theory of value-ranges, for example. By forming the value 
range of a functional, we condense into a single object all the information 
about the values of that functional for all possible arguments. Thus the 
second level functional which assigns to each first level functional its 
value-range uniformly localises or finitises the global or infinite properties 
of those functionals. Obviously the same applies to the second level 
functional giving universal quantification. It is the presence of these two 
second level functionals in his theory that lumbers Frege with an ex- 
tensional view of functionals. Thus in his review of Husserl’s Philosophie 
der Arithmetik, he wrote: ‘. . . coincidence in extension [value-range] is a 
necessary and sufficient condition for the occurrence between concepts 
of the relation corresponding to identity between objects.’ And he added, 
in a footnote to this passage: ‘Identity, properly speaking, does not apply 
to concepts’ ([1894]). This last remark is important. Strictly speaking, 
on Frege’s view there is simply no way of asserting the equality of two 
functionals. The closest we can come, on Frege’s analysis, to asserting 
the identity of two functionals is asserting the identity of their value 
ranges. And, of course, it is not possible simply to stipulate 


\[ [f = g] \;=_{\mathrm{df}}\; [\,\acute{\varepsilon} f(\varepsilon) = \acute{\varepsilon} g(\varepsilon)\,] \]

since the proposed definiendum is unsaturated, whereas the definiens is not. 
But if Frege’s doctrine of value-ranges is accepted, there is a temptation 
to think, impatiently, that it is only a logical quibble which stands in 
the way of a fully-fledged extensional view of functionals. 

This, however, is a temptation which Frege himself successfully resisted. 
Indeed, his insistence on the importance of the functional-object dis- 
tinction is all the more remarkable given his doctrine of value-ranges. 
For in the presence of value-ranges the distinction has no ‘practical’ 
value. In fact, from a strictly workaday point of view, it is merely a source 
of unnecessary complication. After all, the sole purpose of introducing 
the value-ranges was to get round the difficulties caused by his insisting 
upon this distinction in the first place. (One is reminded here of the case 
of Russell and the Axiom of Reducibility.) And not only does the necessity 
for introducing value-ranges give rise to considerable technical complica- 
tions (see volume I, sections 9 and 10 of the Grundgesetze, for example), 
but Frege himself felt constrained to express reservations about their 
introduction into his system, even before it was pointed out to him that 
his Basic Law V led to a contradiction (see p. vii of the Grundgesetze). 

On the other hand, once we reject Frege’s theory of value-ranges, and 
with it his theory of unrestricted quantification, then the extensional view 
of functionals no longer forces itself upon us. Should functionals be 
regarded intensionally then? I would gladly accept such a proposal, pro- 
vided it was clearly stipulated that they are not to be regarded as intensional 
objects. However, I do not believe that this is the direction in which we 
should search for a deeper understanding of the functional-object dis- 
tinction. There is too great a danger that we might be tempted to settle 
for a merely verbal solution to what is, after all, a real difficulty. (There 
is certainly a strong precedent for this in the conventional treatment of 
the supposed distinction between sets and proper classes.) 

In any event, it is clear that we cannot remain satisfied with an explana- 
tion of the functional-object distinction based on Frege’s metaphor of 
‘saturation’. That is an extremely slender thread on which to hang our 
whole ontology. Besides, Frege’s explanation applies only to the least 
problematic use of functional signs—I mean their straightforward use in 
building up complicated terms denoting objects. The real problems occur 
when we have to consider functionals of higher level taking ordinary 
‘first level’ functionals as arguments. And here is where it becomes 
necessary to leave Frege’s own, inadequate, formal theory behind, and 
to consider his fundamental ideas in the context of present day set theory. 

Let’s look at the functional-object distinction as it occurs in modern 
set theory, then. In the most elementary part of the theory, one deals only 
with particular functionals (union, intersection, difference, etc.) so that 
the general notion of functional has no role to play. In this part of the theory, 
functional symbols are used syncategorematically. This means that the 
functionals themselves appear only through their values for single argu- 
ments. But not very much can be done in this way. In order to get past 
this very rudimentary part of the theory, we must move towards employing 
functional signs categorematically, as complete symbols occurring in the 
argument places of signs of higher syntactical level. I speak of ‘moving 
towards’ here, because the all important question is: how far do we have 
to move in this direction of reifying—in Fregean terminology, objectifying 
—functionals. 

Formally, this question arises as soon as we try to introduce quantifica- 
tion into our theory, or to allow definition by comprehension, replacement, 
or recursion. The easiest way to formulate these principles is to use higher 
level functionals, taking both functionals of lowest level and sets as 
arguments. In order to do this for quantification and definition by 
comprehension, however, we must adopt Frege’s view of concepts as 
functionals having truth values as values. But once the appropriate formal 
arrangements are made, we can reformulate our fundamental question in 
the following way: given that ordinary (first level) functionals are to appear 
as arguments for functionals of higher level, how are we to avoid treating 
them as objects? What, indeed, is the real meaning of this distinction 
between functionals and objects? And how are we to avoid Frege’s 
fundamental error of disregarding the all-important distinctions between 
the finite and the infinite—taking those terms in their primary senses? 

The answers to these questions lie in the fundamental notion of 
continuity, the most important notion in the foundations of set theory. The 
basic idea here is a simple one. We must insist that whenever higher 
level functionals are introduced, their values cannot depend on all the 
values of their functional arguments. The values of such functionals must, 
on the contrary, be determined completely by the values of their argument 
functionals restricted to certain sets. They must, in short, be continuous 
in their functional arguments. 

This point is most easily illustrated by the quantifiers. If we define 
functionals $Q_1$ and $Q_2$ by 

\[ Q_1(\phi,a) = (\forall x \in a)\,\phi(x) \]
\[ Q_2(\phi,a) = (\exists x \in a)\,\phi(x) \]

then they are easily seen to be continuous in that if $\phi$ and $\psi$ are pro- 
positional functions agreeing on all members of the set a, then 

\[ Q_1(\phi,a) = Q_1(\psi,a), \]
\[ Q_2(\phi,a) = Q_2(\psi,a). \]

Similar observations apply to the comprehension functional C: 

\[ C(\phi,a) =_{\mathrm{df}} \{x \in a \mid \phi(x)\} \]

and the functional, R, for replacement: 

\[ R(f,a) =_{\mathrm{df}} \{f(x) \mid x \in a\}. \]

In all of these cases, the critical neighbourhood giving continuity is the 
simplest possible functional of the set argument, namely, the identity 
functional. 
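
A finite analogue may make this notion of continuity easier to see. The sketch below is a hedged illustration only: it assumes that Python callables can stand in for lowest level functionals and finite frozensets for sets, and the particular phi, psi, f and g are hypothetical examples chosen to agree on the set a while differing outside it.

```python
# Finite stand-ins (assumptions of this sketch) for the higher level functionals.

def Q1(phi, a):
    """Bounded universal quantification over the set a."""
    return all(phi(x) for x in a)

def Q2(phi, a):
    """Bounded existential quantification over the set a."""
    return any(phi(x) for x in a)

def C(phi, a):
    """Comprehension: the members of a satisfying phi."""
    return frozenset(x for x in a if phi(x))

def R(f, a):
    """Replacement: the image of the set a under f."""
    return frozenset(f(x) for x in a)

a = frozenset(range(10))

def phi(x):
    return x % 2 == 0

def psi(x):
    # Agrees with phi on every member of a, differs from it at odd x >= 10.
    return x % 2 == 0 or x >= 10

def f(x):
    return x * x

def g(x):
    # Agrees with f on every member of a, differs from it elsewhere.
    return x * x if x < 10 else -1
```

In this toy setting Q1, Q2, C and R never consult their functional argument outside a, so each returns the same value for phi as for psi, and for f as for g. The set a itself plays the part of the critical neighbourhood.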

With the higher level functional for definition by recursion, however, 
the situation is more complicated. It will therefore be worthwhile con- 
sidering this case in some detail. The idea behind definition by recursion 
is this: given a well-founded relation r and a unary functional, g, of lowest 
level, we want to define a functional f of lowest level such that at any 
argument $a \in \mathrm{Field}(r)$, 

\[ f(a) = g(f \upharpoonright_r a) \]

where $f \upharpoonright_r a$ is the graph of the restriction of f to the set of strict pre- 
decessors of a under the relation r. In order to give this definition uni- 
formly in g, r and a, we introduce a functional, Rec, of higher level 
satisfying the general conditions that 

\[ \mathrm{Rec}(g,r,a) = \emptyset \]

if r is not well founded, and 

\[ \mathrm{Rec}(g,r,a) = g(\mathrm{Rec}(g,r) \upharpoonright_r a) \]

otherwise. The functional Rec satisfying these conditions can be seen to 
be continuous in the functional argument g, but the continuity here is 
more complicated than in the earlier cases. For the simplest argument 
to show continuity here involves the use of recursion itself to define the 
critical neighbourhood. Thus suppose that f is introduced by the recursion 
equation 

\[ f(a) = g(f \upharpoonright_r a) \qquad (a \in \mathrm{Field}(r)) \]

so that 

\[ f(a) = \mathrm{Rec}(g,r,a). \]

Then a critical neighbourhood for continuity is given by the set 

\[ S = \{ f \upharpoonright_r x \mid x \in \mathrm{Field}(r) \}. \]

For it is obvious that if g and $g_1$ agree on this set, i.e. if 

\[ g(y) = g_1(y) \quad \text{for } y \in S, \]

then we have 

\[ \mathrm{Rec}(g,r,a) = \mathrm{Rec}(g_1,r,a). \]

Clearly this way of verifying the continuity of the recursion operator 
involves a considerable degree of impredicativity, not to say circularity. 
Whether a more acceptable alternative is possible, one not open to the 
charge of impredicativity, I simply do not know. I suspect not, however. 
It is perhaps significant, in this connection, to observe that one does not 
really use definition by recursion in its full strength outside general set 
theory itself. Most such definitions encountered in mathematics have the 
following characteristic: the values of the functional being defined by 
recursion can be confined, in advance, to a given set. (Consider, for example, 
the definitions of Borel sets and Baire functions in real analysis.) This kind 
of definition by transfinite recursion can be carried out using only the 
replacement operator. Recursion in the strong sense described above is 
not required. The stronger sort of definition occurs, on the other hand, 
in the development of Cantor’s general theory of transfinite cardinal 
numbers, for example. 
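
The recursion functional and its critical neighbourhood can also be imitated in a finite setting. What follows is a minimal sketch under stated assumptions, not a rendering of the theory: relations are finite sets of pairs (x, y) read as "x is a predecessor of y", well-foundedness is tested by peeling off minimal elements, and the names rec, critical_neighbourhood, g and g1 are hypothetical. The value at arguments outside the field of r is left unspecified in the text; the sketch simply applies g to the empty graph there.

```python
EMPTY = frozenset()

def field(r):
    """All objects occurring in some pair of the relation r."""
    return frozenset(x for pair in r for x in pair)

def predecessors(r, a):
    """The predecessors of a under r."""
    return frozenset(x for (x, y) in r if y == a)

def is_well_founded(r):
    """For a finite relation: repeatedly remove the minimal elements."""
    remaining = set(field(r))
    while remaining:
        minimal = {y for y in remaining
                   if not any(x in remaining for x in predecessors(r, y))}
        if not minimal:
            return False  # a cycle or loop blocks every remaining element
        remaining -= minimal
    return True

def rec(g, r, a):
    """Finite analogue of Rec(g, r, a)."""
    if not is_well_founded(r):
        return EMPTY
    def f(x):
        # Graph of f restricted to the predecessors of x, then handed to g.
        return g(frozenset((p, f(p)) for p in predecessors(r, x)))
    return f(a)

def critical_neighbourhood(g, r):
    """The set S of restricted graphs; note that it is computed by recursion."""
    return frozenset(
        frozenset((p, rec(g, r, p)) for p in predecessors(r, x))
        for x in field(r)
    )

# A small example: the successor relation on 0, 1, 2, 3.
r = frozenset({(0, 1), (1, 2), (2, 3)})

def g(restricted_graph):
    return 1 + sum(value for (_, value) in restricted_graph)

S = critical_neighbourhood(g, r)

def g1(restricted_graph):
    # Agrees with g on S and is deliberately different everywhere else.
    return g(restricted_graph) if restricted_graph in S else -1
```

In this example rec gives the same values for g and for g1 at every member of the field of r, although g1 differs from g off S. The circularity remarked on above is visible in the code: critical_neighbourhood has to call rec in order to say where g matters.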

This concept of continuity makes it possible to give a sharper formulation 
of the Cantorian-finitist point of view than has been possible up to now. 
The fundamental principle of Cantorian finitism can be put like this: 
all operations and properties applicable to functionals must be continuous in 
their functional arguments. 

Notice that in formulating this principle we have to make essential 
use of Frege’s notion of functional. This provides a striking vindication 
of his insistence upon the fundamental character of this notion. Of course 
the modern notion of set is foreign to Frege’s point of view. But even if 
we make full allowance for this, it does not appear at all unreasonable or 
far-fetched to regard the version of Zermelo-Fraenkel set theory I have 
described as a modification and refinement of Frege’s Begriffsschrift. And 
if we take this point of view, we get a new insight into the absolutely 
fundamental character of the functional-object distinction. 

The explanation for the necessity of distinguishing functionals from 
objects is not to be found in Frege’s doctrine of the unsaturatedness of 
functions.¹ It is rather to be found in the concept of individuation. For 
the essential defining characteristic of objects is their possession of clear 
identity conditions. (This is why there is no difficulty about classifying sets 


1 It’s not that Frege’s doctrine of the unsaturatedness of functionals is false. On the 
contrary, it is merely insufficient to account for the full scope of the functional-object 
distinction. The doctrine was designed to explain how functionals could be made to 
combine with their arguments. But there is no need to invoke a general notion of 
unsaturated functionals in order to accomplish this explanation. For it would suffice 
to postulate a single binary functional $Ap(\xi,\zeta)$ (for ‘application’) instead. One could 
then take the generality of functionals to be objects, and construe expressions like 
‘f(a)’ as abbreviations of ‘Ap(f,a)’, so that the necessary ‘gaps’ for the arguments would 
be supplied by the functional $Ap(\xi,\zeta)$. Indeed, Frege’s functional $\xi \cap \zeta$ is just such a 
functional (although it is defined rather than primitive—see section 34 of volume I of 
the Grundgesetze), and his value-ranges are precisely the sort of functionals-as-objects 
envisaged here. This approach, however, is not tenable for the reasons which I am 
about to give. 

as objects.) But functionals are not clearly individuated in this way. Indeed, 
they cannot be, for the principle of continuity implies that it is impossible 
to define a relation of identity between functionals which satisfies the usual 
classical laws of identity. If we accept the principle of continuity, we can 
never acknowledge the existence of a second level relation $\approx$ satisfying the 
condition 

\[ f \approx g \;\rightarrow\; \phi(f) = \phi(g) \]

where $\phi$ is a parameter for second level functionals, which is open for 
arbitrary substitutions. For the continuity condition¹ itself permits us to 
define, given any functional f, a functional g and a condition $\phi$ for which 
the above implication is false. 

This, then, is why functionals cannot be assimilated to objects. Even if 
we could somehow get round the difficulty of functionals being un- 
saturated, there is no way to get round their essential lack of clear identity 
conditions. The principle of continuity thus allows us to see in precisely 
what sense functionals differ radically from objects. 

Incidentally, I think that this discussion brings out, quite clearly, why it 
at first seems very difficult to come to grips with the notion of an improperly 
individuated entity. The very terminology itself looks rather suspect. And 
it suggests that what we must do is to ask what kind of entity it could be 
which could not be properly distinguished from others of its kind. The 
problem invites us to ask ourselves what such entities would be like in 
themselves. But the question we ought to ask is: what would the properties 
of such entities be like? 

The view of classical set theory that emerges out of all of this is a rather 
eclectic one. The basic notion of set itself comes from Zermelo and 
Mirmanoff. The fundamental distinction between objects and functionals 
comes from Frege. But the rationale for that distinction is provided by 
Cantor’s formulation of the problem of infinity for classical set theory 
in terms of his doctrine of the absolute. Finally, the idea that the infinite 

1 Let f be given. Then since we may suppose the value of $f \approx f$ to be true, the continuity 
of $\approx$ must insure the existence of a set s such that if 

\[ (\forall x \in s)\,[h(x) = f(x)] \]

and 

\[ (\forall x \in s)\,[g(x) = f(x)] \]

then $g \approx h$ is also true. Now define g by 

\[ g(a) = \begin{cases} f(a) & \text{if } a \in s \\ \{a\} & \text{otherwise} \end{cases} \]

Assuming, without loss of generality, that f(s) does not have exactly one element, 
we can set 

\[ \phi(h) = [\,h(s) \text{ has exactly one element}\,] \]

to obtain the required counterexample. 

character of functionals should be manifested in the continuity of the 
properties and operations which can be applied to them is due to Brouwer. 


2.3 A formulation of the consistency problem for set theory. 
To anyone who has followed me this far, it should be reasonably clear 
what sort of proof a Cantorian-finitary consistency proof for conventional 
first order Zermelo—Fraenkel set theory might be. Just as in traditional 
Hilbertian proof theory, the purpose of such a proof would be to justify 
not the mathematical (in this case, set-theoretical) content of the theory, 
but its logical content—in particular, its employment of quantifiers whose 
bound variables need not be confined to range over well-defined totalities 
(i.e. sets). 

A finitist of the traditional sort does not question the legitimacy of the 
principle of mathematical induction: 

\[ \frac{P(0) \qquad P(a) \rightarrow P(a+1)}{P(t)} \]

where a is a parameter (subject to the usual restrictions), t is a term, and 
P denotes any well-defined property. What is at issue in consistency 
proofs is the classical, or for that matter the intuitionistic, first order 
formalisation of this principle. And what does require a finitist justification 
is the use of unbounded quantification over the natural numbers in 
defining, or, at any rate, in attempting to define, properties, P, of 
natural numbers to which the principle of mathematical induction can 
be applied. 

From the point of view of traditional finitism, the formulas by means 
of which these supposed properties are specified are, prima facie, meaning- 
less. They acquire such meaning as they do possess, in the formal theory 
of arithmetic, from the rules and axioms, both logical and arithmetical, 
which are incorporated into the formal theory itself—including the very 
instances of the induction inference schema containing the property- 
defining formulas under consideration. Those axioms and rules thus 
constitute a kind of operational definition of the meanings of the formulas 
of the system. (In any case, they clearly impose necessary conditions on 
any stipulation of those meanings.) But there is obviously a strong element 
of circularity or impredicativity in that kind of definition. And it is the 
business of traditional Hilbertian proof theory to analyse this impredica- 
tivity in order to establish that it does not lead to contradictions. 

The state of affairs in set theory is analogous to that in arithmetic. 
Cantorian finitism does not deny the validity of the principle of replace- 
ment. That principle simply says that the image of a set under a functional 
is a set. What is called into question is the first order Zermelo-Fraenkel 
formalisation of this principle. At issue is the legitimacy of using con- 
ditions employing global quantifiers to define the functionals involved. 
Indeed, the impredicativity in the set-theoretical case is even more 
palpable than in the case of arithmetic. For over and above the circularity 
involved in determining the meanings of the formulas from the axioms 
and rules involving those formulas, there is the gross and obvious fact 
that each instance of the replacement axiom schema is patently enlarging 
the very domain over which the quantifiers occurring in its formulation 
range. This is a feature which has no parallel in the case of arithmetic. 
Thus a careful analysis of the effect of the formal logical symbols, especially 
the quantifiers, on the informal, intuitive mathematical notions underlying 
the formal theory is much more urgent and pressing in set theory than 
in arithmetic. 

Of course it would be extremely difficult, and maybe even impossible, to 
spell out what would and what would not be acceptable as a means of proof 
from the Cantorian finitist point of view. But this is also notoriously the 
case with traditional Hilbertian finitism. As long as we are still trying to 
discover a consistency proof, the absence of a general account of methods 
of proof acceptable to Cantorian finitism is not too much of an embarrass- 
ment. The obvious thing to do is to judge any proposed proof on its 
individual merits, as has been done with the various consistency proofs 
for arithmetic. The embarrassment would arise only if we began to suspect 
that no such proof could be constructed and wished somehow to establish 
that this was the case. 

It must be emphasized that all I have managed to do here is to give a 
brief formulation of the consistency problem for set theory by pointing 
out analogies with the consistency problem for arithmetic. I have dis- 
cussed, at considerable length, the point of view from which such a proof 
should be attempted, and I have suggested that the need for such a proof 
is much more pressing in set theory than in arithmetic. But I have by 
no means established this latter point. In particular, I have not eliminated 
the possibility that there may be a trivial, model theoretic consistency 
proof for full second order Zermelo—Fraenkel set theory based on prin- 
ciples inherent in the Cantorian finitist point of view, but not yet 
mentioned in this discussion. (There is no parallel for this in the arith- 
metical case.) Nor have I indicated of what use such a consistency proof 
would be. For simply to prove the traditional formulation of Zermelo— 
Fraenkel set theory formally consistent would not remove the grave 


3 There is an interesting discussion of this issue, from a different point of view from that 
presented here, in Takeuti’s recently published book on proof theory ([1975], p. 130). 
A related point is made in Kreisel ([1969], p. 101). 

foundational defects in that formulation which I discussed at such length 
in section 1. In the next section I shall take up both of these potential 
objections to my proposed search for a consistency proof and show that 
they can be overcome. 


3 NATURAL AXIOMS OF INFINITY 


3.1 The relevance of natural axioms of strong infinity to the consistency 
problem for set theory. 

Now that I have formulated a programme for investigating the consistency 
of Zermelo-Fraenkel set theory analogous to Hilbert’s programme as 
applied to arithmetic, I must consider a serious objection to this pro- 
gramme which does not have any parallel in the arithmetical case. The 
objection in question is contained implicitly in the following important 
passage taken from Gödels article ‘What is Cantor’s Continuum 
Hypothesis?’ : 


First of all the axioms of set theory by no means form a system closed in itself,
but, quite the contrary, the very concept of set on which they are based suggests
their extension by new axioms which assert the existence of still further iterations
of the operation ‘set of’. These axioms can be formulated also as propositions
asserting the existence of very great cardinal numbers (i.e. of sets having these
cardinal numbers). The simplest of these strong ‘axioms of infinity’ asserts
the existence of inaccessible numbers (in the weaker or stronger sense) > $\aleph_0$. The
latter axiom, roughly speaking, means nothing else but that the totality obtain- 
able by use of the procedures of formation of sets expressed in the other axioms 
forms again a set (and therefore, a new basis for further applications of these 
procedures). Other axioms of infinity have first been formulated by P. Mahlo. 
These axioms show clearly, not only that the axiomatic system of set theory as 
used today is incomplete, but also that it can be supplemented without 
arbitrariness by new axioms which only unfold the content of the concept of 
set... 


If Gödel is correct in the view stated so clearly and forcefully here, then a
consistency proof of the kind I have envisaged would be entirely superfluous, 
even if it proved possible to construct one. For the consistency of the 
traditional first order formulation of Zermelo-Fraenkel follows as a trivial 
consequence of the weakest of these axioms. And even more seriously, 
if Gödel’s general point of view is justified, then any precise formulation
of axioms of this sort will immediately give rise to further such axioms, 
from which the consistency of the former axioms will follow in the same 
direct fashion. There would therefore be no hope at all of making sense 
even of a more general version of the consistency problem for set theory, 
which might attempt to take these views of Gödel into account.

The central issue is stated in the last paragraph of the passage I have
just quoted. The basic question is whether the Cantorian finitist point
of view as formally expressed in the version of Zermelo-Fraenkel set
theory which I have described ‘...can be supplemented without
arbitrariness by new axioms which only unfold the content of the concept
of set...’ Are these natural axioms of strong infinity already implicit
in the Cantorian finitist outlook? That is the question to which I must 
address myself. Notice, however, that I do not have to argue against the 
truth of these axioms themselves. Gödel is obviously not claiming that 
he can prove them, only that they represent plausible extensions of already 
accepted principles. All I need do, therefore, is to cast doubt on the
informal arguments which purport to establish their plausibility. I am
anxious not to be misunderstood in this, because I see the attempt firmly 
to establish the plausibility of these axioms to be the main purpose behind 
the search for a consistency proof for traditional first order Zermelo— 
Fraenkel set theory. I want simply to show that such a search is not 
superfluous. 

If my reconstruction of the arguments purporting to establish the
plausibility of these axioms is correct, then they are all, I maintain, based 
on a subtle misconception of the logical role of the axiom of replacement. 
This mistake consists, in my opinion, in taking replacement to be a 
principle of set formation analogous, say, to the formation of the power 
set. My confidence in the correctness of this analysis is strengthened by 
the circumstance that what I regard as the true logical function of replace- 
ment is obscured by the conventional formulation of Zermelo—Fraenkel 
as a first order theory of the traditional sort. The true nature of this 
principle becomes apparent only when it is formulated in terms of a
functional of higher level, in the manner I described in section 2.2. All 
of this I shall explain in greater detail in the discussion that follows. 


3.2 The case for the existence of inaccessible numbers. 


Let us begin by considering Gödel’s claim that the axiom asserting the
existence of inaccessible numbers > $\aleph_0$

... means nothing else but that the totality obtainable by use of the procedures 
of formation of sets expressed in the other axioms forms again a set (and
therefore, a new basis for the further application of these procedures.)

A similar opinion is expressed by Zermelo in his classic paper Über
Grenzzahlen und Mengenbereiche, cited by Gödel in a footnote to the passage
I have just quoted. Obviously any opinion shared by both Zermelo and
Gödel must carry considerable weight. But let’s see if we can reconstruct
the argument here in some detail.

The argument rests, I believe, on the following very simple and very
plausible assumption: given any limited number of set-formation prin- 
ciples, there is already a set which is closed with respect to all of those 
principles. I propose to call this the principle of localised closure because 
it asserts that any clearly defined global closure property of sets is reflected
locally, i.e. inside a particular set.

The idea behind the localised closure principle is that the community 
of sets is so vast that no limited means of set formation can exhaust it. 
The totality of sets produced by any limited number of principles must 
itself form a set. Localised closure is thus an expression of the absolute 
infinity of the realm of sets. It therefore seems quite reasonable to suppose
that whatever can be clearly established on the basis of this principle 
should be accepted as true. 

But this is already enough—or so it would appear—to establish the 
existence of strongly inaccessible numbers. Consider, for example, how 
we might obtain a strongly inaccessible number greater than a given 
ordinal number $\alpha$. Simply take the set $R(\alpha+1)$ of pure sets (sets of rank
$\le \alpha$ built up from the empty set of urelements) and close it out under
the principles of set formation embodied in the Zermelo-Fraenkel axioms,
namely, the principles of power set formation, pairing, union, and replace-
ment. The result is a set a closed under all of these basic set-formation
principles. The ordinal number representing the rank of a is then the
required strongly inaccessible number greater than $\alpha$.

Well, what is wrong with this argument? I am convinced that it contains 
a subtle fallacy. But the difficulty does not lie in the principle of localised 
closure. No, the fallacy consists in viewing the replacement axiom as a 
principle of set formation. A principle of set formation must, of necessity, 
impose a closure condition on the realm of sets, namely, that along with 
any set there must be the further set constructed from it by that principle. 
But the principle of replacement does not impose a closure condition 
on the realm of sets—at least, not directly. As a functional of higher 
level, the replacement functional imposes a closure condition on the realm 
of first level functionals, namely, that along with any first level functional f 
there is the further first level functional f" (in the notation of Gödel’s
monograph) obtained from it by replacement. 

But fully to understand why replacement cannot be regarded as just 
another set formation principle, we must look more deeply into the logical 
connection between replacement and localised closure. It is particularly 
inappropriate to regard replacement as a principle of set formation for 
the purposes of applying localised closure. For it is possible to see replace-
ment as an alternative formulation of localised closure. More exactly,
a precise formulation of localised closure is derivable from replacement
together with a single instance of recursion (involving definition by
induction along the natural numbers); conversely, replacement is
derivable, in return, from this precise formulation of the localised closure
principle. The precise version of localised closure in question can be given
as follows: 

Given any functional f and any set a, there is a set b such that

(i) a is a subset of b

(ii) b is closed under f (i.e. f(x) is in b whenever x is in b)

We can avoid the existential formulation of this principle by observing
that the intersection of any collection of sets containing a and closed
with respect to f is itself a set containing a and closed with respect to f. Thus
the content of this principle can be expressed by introducing a higher
level functional, cl, such that cl(f, a) is the smallest superset of a closed
with respect to f.
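
A finite toy model may help to fix ideas. The following Python sketch computes the smallest superset of a finite set a closed under a function f by iterating stage by stage along the natural numbers, which mirrors the single instance of recursion mentioned above. The name `closure`, the `max_steps` guard and the modular doubling example are illustrative assumptions and do not come from the text; genuine set-theoretic functionals are, of course, not finite Python functions.

```python
from typing import Callable, Hashable, Iterable, Set


def closure(f: Callable[[Hashable], Hashable],
            a: Iterable[Hashable],
            max_steps: int = 10_000) -> Set[Hashable]:
    """Smallest superset of `a` closed under `f`, built stage by stage.

    Stage 0 is `a`; each later stage adds the f-images of the elements
    added at the previous stage. The loop stops once nothing new appears.
    """
    b = set(a)
    frontier = set(a)
    for _ in range(max_steps):
        new = {f(x) for x in frontier} - b
        if not new:
            return b
        b |= new
        frontier = new
    # Assumption of this sketch: give up rather than loop forever.
    raise RuntimeError("no finite closure found within max_steps")


# Hypothetical example: doubling modulo 7, starting from the set {1}.
orbit = closure(lambda x: (2 * x) % 7, {1})
```

In this finite setting the result plays the role of the set b in the principle: it contains a, and f(x) lies in it whenever x does.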

Now this does not look like a formulation of the full localised closure
principle. In its informal version, that principle asserts the existence of
sets closed with respect to any ‘limited number’ of set formation principles,
whereas the formal version I have just given only asserts the existence of
sets closed with respect to a single set formation principle embodied in a
given functional f. But appearances are deceptive here. It is not difficult,
for example, to deduce, from the given formal version of the closure
principle, that, given a family $(f_i)_{i \in I}$ of functionals indexed by a set, I, then
for any set, a, there is a set b which is a superset of a and is closed with
respect to all of the $f_i$. Thus we need only define a functional f by setting

\[
f(x) =
\begin{cases}
\{\langle i, f_i(x_i)\rangle \mid i \in I\} & \text{if } x \text{ is an } I\text{-termed sequence} \\
x & \text{otherwise}
\end{cases}
\]

Then if $b'$ is the smallest set guaranteed by the above version of local
closure such that

\[ a \subseteq b' \]

and

\[ \{f(x) \mid x \in b'\} \subseteq b' \]

we may set

\[ b = \{x_i \mid x \in b',\ i \in I\} \]

to get

\[ a \subseteq b \]

and

\[ \{f_i(x) \mid x \in b\} \subseteq b, \quad \text{for each } i \in I \]

as required.

Thus if we have any definite number of functionals (i.e. any collection
of them indexed by a set) then there must be a set closed with respect to 
all of them. And with a little more work we can derive even stronger-
looking versions of the principle from the simple version given above,
which, in its turn, is derivable from replacement and elementary instances 
of recursion. 
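
The conclusion for a family of functionals indexed by a set can also be illustrated in a finite toy setting. The Python sketch below closes a finite set under every member of an indexed family at once. It iterates over the family directly and does not reproduce the reduction to a single functional on sequences given above. The name `closure_family`, the `max_steps` guard and the two modular functions are assumptions made purely for illustration.

```python
from typing import Callable, Dict, Hashable, Iterable, Set


def closure_family(fs: Dict[Hashable, Callable[[Hashable], Hashable]],
                   a: Iterable[Hashable],
                   max_steps: int = 10_000) -> Set[Hashable]:
    """Smallest superset of `a` closed under every function in `fs`.

    `fs` maps each index i to a function f_i; the index set is fs.keys().
    """
    b = set(a)
    frontier = set(a)
    for _ in range(max_steps):
        new = {f(x) for f in fs.values() for x in frontier} - b
        if not new:
            return b
        b |= new
        frontier = new
    # Assumption of this sketch: give up rather than loop forever.
    raise RuntimeError("no finite closure found within max_steps")


# Hypothetical family indexed by I = {0, 1}, acting on residues modulo 6.
family = {0: lambda x: (x + 2) % 6, 1: lambda x: (x * 5) % 6}
b = closure_family(family, {1})
```

Here b is a superset of the starting set and is closed with respect to each member of the family, which is the finite shadow of the statement just derived.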

But what does all of this mean? It means that the principle of localised 
closure is already incorporated in Zermelo-Fraenkel set theory. And it 
therefore follows that we cannot use the principle of localised closure 
to take us beyond Zermelo-Fraenkel because it is already embodied in 
Zermelo-Fraenkel. The belief that it is possible to get beyond Zermelo-
Fraenkel using this principle is simply an illusion spawned by the mistaken
view that the principle of replacement is a principle of set formation. To 
be sure, the mistake involved is a very subtle one, and, consequently, 
one which it is all too easy to make: the very name ‘replacement’ contains 
within it the seeds of error. (Is not one simply ‘replacing’ the elements
of one set with other objects to form a new set?) But it is, I believe, a
mistake to view replacement in this way, nonetheless. 


3.3 The case for Mahlo’s principle.

The arguments which purport to establish the plausibility of Mahlo’s 
principle are invalidated by a difficulty very similar to that just described. 
Since the arguments are more sophisticated, the difficulty is correspondingly 
harder to discern. I can think of no way to expose it, however, other 
than by putting the case for Mahlo’s principle as strongly as I can, and
then pointing out where, in my opinion, it breaks down. Of course the 
possibilities for self-deception in such a procedure are obvious. But to 
anyone with doubts arising out of suspicions of my motives, I can only 
offer the following challenge: find an argument for the plausibility of 
Mahlo’s principle which is not open to the same sort of objection that 
I am about to give. 

I must begin with a formulation of Mahlo’s principle. Let f be a 
functional defined on the ordinals with ordinals as values. An ordinal 
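number $\alpha$ will be called accessible with respect to f if

(1) $\alpha < f(\beta)$, for some $\beta < \alpha$

or

(2) $\alpha$ is singular, i.e. there is an ordinal $\lambda < \alpha$ and a $\lambda$-termed sequence
$(\alpha_\beta)_{\beta < \lambda}$, all of whose terms are strictly less than $\alpha$ (i.e. $\alpha_\beta < \alpha$, for all
$\beta < \lambda$) such that $\sup (\alpha_\beta)_{\beta < \lambda} = \alpha$.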

If $\alpha$ is not accessible with respect to f it is then inaccessible with
respect to f. (Note that this use of ‘accessible’ and ‘inaccessible’
differs from the common one. ‘Weakly inaccessible’ in the customary
terminology means ‘inaccessible with respect to the functional $\aleph$’ in terms
of this definition, for example.)

I can now state:

Mahlo’s Principle: Every ordinal valued ordinal functional has arbitrarily
large inaccessible points.

As before, the existential form can be avoided by introducing a higher
level functional assigning to each appropriate f and $\alpha$, the first inaccessible
point of f beyond $\alpha$.

Now the argument for Mahlo’s principle runs something like this:
Suppose, contrary to the principle, that the particular ordinal functional f
has no inaccessible points beyond the particular ordinal number $\alpha$, i.e. that
every ordinal $\beta > \alpha$ is accessible with respect to f. Surely, the argument
now goes, this represents an unwarranted limitation upon the length of
the ordinal number sequence. For the very existence of f and $\alpha$ as definite
and therefore, in some sense, limited entities bears witness to the im-
possibility of the absolutely infinite sequence of ordinal numbers beyond $\alpha$
being generated in its entirety by f. Surely the absolute infinity of this
sequence cannot be captured by the single rule, f, of ordinal construction.

By the sequence of ordinals beyond $\alpha$ being generated in its entirety
by f I mean simply this: every ordinal number $\beta > \alpha$ is, since it is access-
ible with respect to f, implicitly definable from below. Thus all the ordinals
beyond $\alpha$ are, as it were, given in advance by f. And it is this which
constitutes the unjustified limitation on the absolute infinity of the ordinal
sequence beyond $\alpha$.

Now this line of argument is obviously very plausible. It rests upon 
the premiss that one ought to adopt those assumptions which postulate 
the greatest possible extent for the realm of sets. This is quite reasonable 
as an operating assumption. And it is possible to defend its use by means 
of arguments of considerable intuitive and heuristic weight. (See Gödel
[1947] for a thorough discussion of this point. Number 4 of the supplement
to the second edition is especially relevant here.) No, the flaw in the above
argument does not lie in this premiss, even though one has, as it were, a
legal right to question it. The difficulty lies in the implied, but un-
substantiated, claim made by the use of the terms accessible and inaccessible.
(And here I must confess to having picked those terms deliberately with
an eye to improving the appearance of the argument.)

The question boils down to this: given a functional f, is it the case that 
every ordinal $\alpha$ accessible to f in the technical sense of the definition is,
in fact, accessible to it in the intuitive sense of being implicitly defined
from below by f and the ordinals less than $\alpha$? Now if the case for Mahlo’s
axiom is to be sustained, the answer to this question must be yes. And
the argument must run something like this: If $\alpha$ is accessible to f then $\alpha$
is implicitly defined by f and the ordinals below $\alpha$, either

(a) directly, via clause 1 in the definition of accessibility (i.e. $f(\beta) > \alpha$ for
some $\beta < \alpha$)

or

(b) indirectly, via clause 2 of that definition (i.e. $\alpha = \sup (\alpha_\beta)_{\beta < \lambda}$, where
$\lambda < \alpha$ and $\alpha_\beta < \alpha$ for all $\beta < \lambda$).

We can see the full force of this argument if we inquire about the first 
ordinal which is itself accessible to f, and all of whose predecessors are
accessible to f, but which is not implicitly given by f in the sense required.
If we suppose $\alpha$ to be this ordinal, then it is obviously absurd to maintain
that f does not ‘give’ $\alpha$ if clause 1 in the definition of accessibility applies,
i.e. if $\alpha < f(\beta)$ for some $\beta < \alpha$. Hence it must be the case that $\alpha$ is singular.
But then surely, so the argument must go, if $\lambda$ and $\alpha_\beta$ ($\beta < \lambda$) are all less
than $\alpha$, and therefore implicitly determined by f, then $\alpha = \sup (\alpha_\beta)_{\beta < \lambda}$
must also be so determined, since the operation of passing to the limit is a
perfectly legitimate and acceptable one. And so, indeed, is the operation
of passing from a sequence to its limit a perfectly legitimate and acceptable
one. The difficulty here lies elsewhere. The non sequitur in this argument
is in the assumption that if $\lambda$ is determined implicitly by f, and each
$\alpha_\beta$ ($\beta < \lambda$) is, individually, determined implicitly by f, then the sequence
$(\alpha_\beta)_{\beta < \lambda}$ is determined implicitly by f. Indeed, since the accessibility of $\alpha$
requires only that there exist such a sequence, it must be assumed that f
implicitly determines all of $\alpha^\lambda$ (i.e. all sequences of length $\lambda$ with terms
less than $\alpha$) for every $\lambda < \alpha$. But what could be the basis for such an
assumption?

Perhaps we can better come to grips with the difficulty involved here, 
if we contrast the present case with a simpler one. Let us imagine how we
might go about establishing the plausibility of the supremum principle
which asserts that, given a functional f, the values $f(\gamma)$ for $\gamma < \lambda$ have a
least upper bound, $\sup (f(\gamma))_{\gamma < \lambda}$. (In the presence of suitable further
assumptions this principle is equivalent to that of replacement.) We might
try to justify this principle by employing the usual ‘construction’ metaphor.
Thus we might imagine that as the ordinals $0, 1, \ldots, \gamma, \ldots$ less than $\lambda$ are
generated, we set in motion a new generating process
$f(0), f(1), \ldots, f(\gamma), \ldots$ which parallels it. Since the first process of generation must come
to an end (as, by hypothesis, $\lambda$ is already given), so, too, must the second,
thus determining the new ordinal $\sup (f(\gamma))_{\gamma < \lambda}$. The picture here becomes
especially clear if we take the functional f to be monotone increasing,
and imagine that in generating the sequence $f(0), f(1), \ldots, f(\gamma), \ldots$ we
take unions of larger and larger initial segments until the union of the
whole sequence is obtained. (I am thinking of the ordinals in von
Neumann’s sense here, in which each one is identified with the set of its
predecessors.)
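
The picture of the supremum as a union can be made concrete for finite von Neumann ordinals. The Python sketch below represents each finite ordinal as the frozenset of its predecessors and forms the supremum of the values of f below a finite bound by taking the union of those values. The names `ordinal` and `supremum` and the doubling functional are illustrative assumptions, not part of the text. In the finite case the union is simply the largest value, so the sketch shows the mechanism only and not the transfinite content of the principle.

```python
from typing import Callable


def ordinal(n: int) -> frozenset:
    """Finite von Neumann ordinal n, i.e. the set {0, 1, ..., n-1}."""
    result = frozenset()
    for _ in range(n):
        result = result | {result}  # successor step: x together with {x}
    return result


def supremum(f: Callable[[int], frozenset], lam: int) -> frozenset:
    """Union of f(0), ..., f(lam - 1), each value a von Neumann ordinal.

    The union of a set of ordinals is its least upper bound.
    """
    total = frozenset()
    for gamma in range(lam):
        total = total | f(gamma)  # union of a longer initial segment
    return total


# Hypothetical monotone increasing functional: gamma is sent to 2 * gamma.
limit = supremum(lambda gamma: ordinal(2 * gamma), 4)
```

The loop in `supremum` follows the description above: the running union grows as longer initial segments of the sequence of values are taken in.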

But if we attempt, by similar means, to establish that a singular ordinal 
is implicitly defined from below in a manner different from the trivial 
sense in which any ordinal is determined by the set of all smaller ones, 
we shall be forced to consider a much more complicated procedure. Thus, 
if $\alpha$ is the singular ordinal in question, we must imagine that as we generate
the ordinals $0, 1, \ldots, \lambda, \ldots$ less than $\alpha$ we simultaneously set in motion,
as parallel side processes, all possible sequences $(\alpha_\beta)_{\beta < \lambda}$ for which $\lambda$ and
$\alpha_\beta$ ($\beta < \lambda$) are all less than $\alpha$. But the difficulty is this: it would seem
reasonable here to assume that no partially completed side process could
contain terms not already generated in the (only partially completed)
principal sequence, $0, 1, \ldots, \lambda, \ldots$ of ordinals less than $\alpha$. Thus it would
seem that one would have to get all the way to $\alpha$ before one was able to tell
whether $\alpha$ were singular or not.

The argument for Mahlo’s principle which I have developed here is 
really based on the following idea: if one is using the functional f to generate 
ordinals, one will be able to pass through any singular ordinal, and will 
therefore be prevented from generating the whole of the absolutely 
infinite ordinal number sequence only by the existence of regular fixed 
points of f, i.e. by what I have called inaccessible points of f. This idea, 
in turn, rests upon the conviction that every singular ordinal, all of whose 
predecessors are accessible points of f, must be visible, as it were, long
before its actual generation. This means that one knows that the collection 
of ordinals one is generating is not just the set of all ordinals less than 
some singular ordinal. For one will somehow be able to see every such 
singular ordinal below the next inaccessible point of f, well in advance 
of actually having generated the set of all of its predecessors. 

As we have just seen, there are very serious difficulties in this point 
of view. And yet, it is very hard to shake off. Why should this be so? 
I believe that there is a conceptual trap here, produced by the logical 
form of the condition for singularity. The condition itself is a purely 
existential one (viz. $\alpha$ is singular if there exists a function whose domain is
a proper initial segment of $\alpha$ and whose values form a sequence whose
supremum is $\alpha$). But when we try to form the idea of a singular ordinal,
we do not picture to ourselves merely that such a function exists. We
invariably have in mind a familiar, well-defined functional, like the
functional $\aleph$, for example, some initial segment of which instantiates that
bare existential assertion. Thus, as with $\aleph_\omega$, we always imagine singular
ordinals which are defined as the limits of certain short sequences of values
of well-known particular functionals that we already have in mind. And,
naturally enough, such singular ordinals do seem to be determined in
advance. But surely a singular ordinal whose singularity is so easily seen
must be the exception rather than the rule. It is easy enough to manufacture
singular ordinals, but is it really all that easy to distinguish them, when
they have been given by some means other than direct manufacture?
What we need to imagine is the generic case, in which we do not have a 
simply definable short sequence witnessing the singularity. Suppose, 
for instance, that the first constructably uncountable ordinal, $\aleph_1^L$, is, in
fact, countable, when arbitrary, rather than merely constructable, in-
creasing sequences of smaller ordinals are considered. Such a supposition
is not at all unreasonable. Indeed, it may very well express the view of
most set-theorists. Under that supposition, $\aleph_1^L$ is an example of a singular
ordinal of the simplest type (namely, an $\omega$-cofinal one) which has no
definable short sequence witnessing its singularity.


Is there a sense, then, in which a singular ordinal is ‘small’ or ‘reachable 
from below’? One might be tempted to say here that a singular ordinal 
is not closed under replacement, or, more appropriately, that it is not 
closed under the supremum principle described above which is the 
ordinal-theoretic equivalent of replacement. But that would be highly 
misleading. For it is not the singular ordinal $\alpha$ itself which fails to be
closed under this principle. It is rather the set $\alpha^\alpha$ of all $\alpha$-termed sequences
of ordinals less than $\alpha$. (More precisely, the closure in question is with
respect to the operation which sends a function $a\colon \alpha \to \alpha$ to the function
$\beta \mapsto \sup (a`\gamma)_{\gamma < \beta}$, defined for $\beta$ in $\alpha$.) Naturally, one cannot put a
condition on the set $\alpha^\alpha$ without putting a condition on $\alpha$ itself. And in
this case, the fact that $\alpha$ is not closed with respect to the function just
described entails that there exists a function defined on a proper initial
segment of $\alpha$ whose values, though lying in $\alpha$, have no upper bound in $\alpha$.
But the fact entailed is a bare existential assertion. There can be no question
of being able in general actually to define the latter function. Surely it
would be stretching a point to claim that we knew this function in advance
of the generation of $\alpha$, although, of course, we do know, in advance, the
functional with respect to which $\alpha^\alpha$ is not closed. We must not, however,
confuse knowledge of the one with knowledge of the other. It is therefore
extremely misleading to speak of a singular ordinal as being small by virtue
of the failure of some simple closure condition on the ordinal itself (i.e.
on the set of its predecessors).

I feel distinctly uncomfortable using all of these metaphor based argu-
ments. I am extremely sceptical that any positive conclusions can be
reached by such methods. Indeed, I strongly believe that any use of 
notions essentially involving kinetic or temporal metaphors—notions like 
construction, generation, etc.—should be strictly avoided, even if one is 
discussing ‘constructivist’ foundations of the theory of natural numbers. 
For even in that case, their employment can obscure and confuse the 
real issues. But the use of such notions is perfectly acceptable in an 
argument ad hominem of the sort just given. And it is most important 
that the informal arguments for the adoption of axioms of strong infinity 
should be faced within their own terms of reference, if their plausibility 
is seriously to be called into question.

It seems to me that the same misreading of the logical status of the 
replacement principle underlies the arguments both for the existence of 
strongly inaccessible numbers and for Mahlo’s principle. Both involve a 
subtle confusion of the level on which closure conditions are operating. 
The crux of the matter is this: replacement is not (like, say, power set) 
a principle of set construction. It is itself equivalent to the formal version 
of the principle that no means of set construction can exhaust all sets.


3.4 The need for a consistency proof.

Such, then, are the kinds of arguments that can be advanced in favour of 
the plausibility of the natural axioms of strong infinity. Of course I cannot 
claim to have refuted them decisively, but I think I can claim to have 
sown the seeds of doubt. To be sure, we already had the legal right, as 
it were, to doubt these arguments, for it was never intended that they be 
taken as logically compelling. What I have tried to show, however, is 
that even if one is sympathetic to the vaguely articulated general principles 
underlying these arguments there are still strong grounds for suspending 
judgment.

It has certainly not been my intention to call into question the possibility 
that these axioms of strong infinity might be justifiable. My attack has 
not been upon these axioms themselves—at least not directly. It has been 
aimed at arguments purporting to show that these axioms are already a 
natural expression of principles inherent in the Cantorian finitist outlook. 
It is vitally important for my case that the justification of these axioms 
remains a real possibility. For that possibility provides the sole motivation 
for a search for the kind of consistency proof I have in mind. As I have 
already argued at length, practical considerations (considerations quite
distinct from doubts about its consistency) lead us to reject the con-
ventional first order formulation of Zermelo-Fraenkel. That formalisation
of general set theory is quite beyond rehabilitation by even the most
rigorously convincing consistency proof imaginable. And it is here that
the problem of the axioms of strong infinity is relevant. For a proof of
the consistency of traditionally formulated Zermelo-Fraenkel would be
a partial justification of the simplest of those axioms, namely, the one
postulating the existence of a single strongly inaccessible number.

Now if the usual arguments for the acceptance of the natural axioms 
of strong infinity are rejected, then the need for a Cantorian finitary 
consistency proof of the kind I have discussed becomes apparent. Indeed, 
as I have already remarked, the need here is a very much more practical 
matter than in the case of arithmetic, even disregarding, for the moment, 
the fact that constructive consistency proofs for classical Peano arithmetic
actually exist. And I am not making here the obvious point that it is 
easier to ‘doubt’ set theory than arithmetic (since it is a stronger theory), 
nor the equally obvious point that nobody has ever seriously questioned 
the consistency of arithmetic, while many believe the general methods 
of set theory to be mistaken and unjustifiable. In any case, for most of the
critics of set-theoretical methods the kind of consistency proof I have
in mind would carry no conviction whatsoever, as it would simply take the 
legitimacy of those disputed methods for granted. No, when I say that the 
need for a consistency proof is more pressing in set theory than in arith- 
metic, I am making an altogether more important point than any of these. 

What I am saying is this: the gap between the accepted informal 
mathematical methods and their formalisation in the conventional way 
as a traditional first order theory is greater in the set-theoretical than in 
the arithmetical case. In short, classical first order Zermelo—Fraenkel set 
theory is more dubious, as an embodiment of Cantorian finitism, than is 
classical first order Peano arithmetic, as an embodiment of ordinary 
finitism. At least, that is my conviction. I want now to spell out some of 
the more important reasons that have led me to maintain it. 

(1) The localness of outlook, which, as I have already explained, is an 
essential feature of the modern structuralist approach to mathematics, 
means that the conventional formulation of set theory is never really put 
to the test. In particular, the full strength of the first order formulation 
of the principle of replacement as an axiom schema has rarely been tested 
in mathematical practice. Of course mathematicians use replacement all 
the time—but in its naive form. It is doubtful that very many of them would 
be able to describe its full formal version in the traditional formalisation 
of Zermelo-Fraenkel, if, indeed they were even able to recognise the 
name ‘replacement’. The situation with regard to arithmetic is in strong 
contrast with this. Ordinary mathematical practice goes far beyond
Hilbertian finitism. Every mathematician has had practical experience of
treating the natural numbers as a ‘completed totality’. And it is this
practical experience which gives us confidence that, even if there may be
some justice (and perhaps even a great deal of justice) in the Hilbertian
finitist criticism of classical Peano arithmetic, there is little reason to
suspect that theory of formal inconsistency. But there is nothing analogous 
to this to give us confidence in the conventional formulation of Zermelo- 
Fraenkel set theory. For there is no rival approach to mathematics lying 
beyond Cantorian finitism. 

(2) The impredicativity involved in the traditional formulation of
replacement as a first order axiom schema has no parallel in the arith-
metical case. In arithmetic the difficulties with impredicativity entirely
concern the problem of determining meanings for the logical symbols. 
(This is perhaps more obvious if we consider the usual intuitionistic 
explanation of their meanings.) This difficulty is also present in first order 
set theory. But there is an even more direct, more palpable instance of 
impredicativity in the latter case. For it is obvious that each instance of
the replacement axiom scheme has the intuitive function of adding sets to 
the very domain of sets over which its own bound variables vary. And there 
is nothing corresponding to this kind of impredicativity in the arithmetical 
case. 

(3) The traditional first order formulation of Zermelo-Fraenkel blurs, 
indeed obliterates, the sharp distinction between the principle of replace- 
ment, properly so called, and the principle of definition by transfinite 
recursion. However, as we have already seen, this distinction is an 
important one. In the first place, the principle of replacement is really 
part of first order logic, whereas the principle of definition by transfinite 
recursion is not. But even more important than this, the continuity of 
the recursion operator is an altogether more problematic matter than that 
of the replacement operator. The continuity of replacement is no more 
complicated than that of comprehension, or bounded quantification. In all 
of these cases, the identity functional of the set argument gives the critical 
neighbourhood for continuity. By way of contrast, there appears to be 
no way to establish the continuity of recursion other than by employing 
recursion itself in defining the critical neighbourhood (see section 2.2 for 
details). Again, we have here a phenomenon which has no parallel in the 
arithmetical case. 

(4) In set theory, the passage from the finitist (in this case, Cantorian 
finitist) free variable theory, in which all quantifiers are bounded, to the 
theory in which global quantifiers are present, requires a much greater 
leap if the new quantifiers are classical than if they are merely intuitionistic. 
There is no analogue in set theory of the Gödel translation for arithmetic, 
by virtue of which the classical theory is to be seen as a subsystem of the 
intuitionistic one. Not surprisingly, the translation fails at replacement. 
In contrast with arithmetic, where the negative translation of an instance 
of induction is also an instance of induction, the negative translation of an 
instance of replacement is no longer an instance of replacement, nor, 
apparently, can it be derived intuitionistically from such an instance. The 
result is that the classical globalisation of set theory is much stronger than 
the intuitionistic one. Here again the contrast with the arithmetical case 
is striking. 

The plain fact is that an inconsistency in the classical first order 
formalisation of Zermelo-Fraenkel is not, after all, inconceivable. There 
is, to be sure, very strong evidence against this. For many set-theorists 
have tried, without success, to prove much stronger systems inconsistent. 
This must, of course, carry considerable weight as evidence. It tips the 
balance of probability decisively in the direction of the consistency of 
the theory. Still, however persuasive this evidence may be, it is entirely 
of a practical nature. We have no solid mathematical reasons for believing 
the system to be consistent. 

Let us forget, for the moment, the unlikelihood of an inconsistency 
occurring in the traditional formulation of Zermelo-Fraenkel, and ask 
ourselves what would be the effect if such an inconsistency were dis- 
covered. Would it be a disaster for modern structuralist mathematics? 
I think we must conclude that the answer would be no. The most likely 
source of such an inconsistency would be some sort of very complicated 
diagonalisation argument which made essential use of the peculiarities of 
the formalisation of the principle of replacement as a classical first order 
axiom schema. (Presumably the principal formula of the inconsistent 
instance of the schema would have to be at least 4, so as to give a clear 
violation of continuity.) In such circumstances, it would very likely 
be clear that what had gone wrong was the formalisation of the principle 
of replacement, and not the mathematical intuition underlying that 
formalisation. Thus theoretically, at least, such a discovery might produce 
only a very minimal disturbance in the foundations of classical mathe- 
matics. (I speak here in objective terms, of course; it is difficult to imagine 
what would be the psychological effect of such a discovery.) Indeed I 
believe that if this were the best of all possible mathematical worlds, 
then there would be an inconsistency in the classical first order formulation 
of Zermelo-Fraenkel. For in that case the foundations of set theory would 
be enormously simplified, not to mention the fact that a large number of 
open questions would be instantaneously settled. 
Unfortunately, it is too much to hope for such an eventuality. The 
overwhelming balance of probability lies with the consistency of first 
order classical Zermelo-Fraenkel, however pleasant it might be to contemplate 
the possibility of its inconsistency. But it would undoubtedly be 
desirable to supplement our, at present, purely pragmatic evidence for 
this consistency with solid, mathematical arguments. Besides, a consistency 
proof of the kind I have in mind would shed considerable light on the funda- 
mental concepts of replacement and recursion. And this, perhaps, might 
be the most valuable result to be expected. Even the mere search for 
such a proof may bring interesting phenomena to light, and may encourage 
the development of theories which are important in their own right, 
independently of any possible application to a consistency proof. (I think 
the possibility of developing the theory of continuity for functionals of 
higher level than the first is a case in point.) And finally, of course, if 
there is an inconsistency lurking somewhere in the traditional formulation 
of Zermelo-Fraenkel, the search for a consistency proof may very well 
bring this to light. 

I shall not attempt to discuss, at any length, the directions in which 
such a consistency proof might be sought. One obvious approach would 
be to look for analogies to the existing consistency proofs for arithmetic. 
The most promising candidates for generalisation here are the proofs 
involving Herbrand’s theorem (Dreben and Denton [1970] and Scanlon 
[1973]) and the functional interpretation proofs arising out of Gödel’s 
Dialectica interpretation (Gödel [1958]). Indeed, I have myself generalised 
Gödel’s functional interpretation for arithmetic to obtain a Cantorian 
finitary consistency proof for a version of Zermelo-Fraenkel extending 
the classical theory described above by employing intuitionistic global 
quantification (see my [1977b]). On less orthodox lines, it might be 
possible to make something of the interesting argument sketched by 
Cohen in his [1971]. 

I think that all of these considerations show that the search for a 
Cantorian finitary consistency proof for classical first order Zermelo-Fraenkel 
set theory would be a reasonable undertaking. And the fact that the 
consistency problem can be formulated in such a way that an attack on it 
seems feasible should provide all the incentive necessary. 


APPENDIX 


I shall give a somewhat more detailed description of the general system 
of classical logic discussed in sections 1.7 and 2.2. For a complete formal 
exposition of the theory, my article ‘A New Formulation of Elementary 
Logic and Set Theory’ must be consulted. 

Terms and formulas are the two fundamental kinds of categorematic 
(independently meaningful) expressions in the formal language of the 
system. When interpretations have been assigned to their parameters, 
terms denote objects and formulas express propositions. 

Complex formulas can be built up from simpler ones using propositional 
connectives 

$[A \wedge B]$, $[A \vee B]$, etc. 

or quantifiers 

$(\forall x \in t)\,A[x]$, $(\exists x \in t)\,A[x]$ 

Basic formulas consist of simple assertions of membership 

$s \in t$ 

inclusion 

$s \subseteq t$ 

or equality 

$s = t$ 

built up from terms using the copulas $\in$, $\subseteq$, and $=$; predicate parameters 
(of arbitrary degree) may also be used: 

$P(t)$, $Q(s, t)$, $R(r, s, t)$, etc. 

In any application, these must be assigned relations of the appropriate 
degree as their interpretations. 

Terms are built up in the usual way from individual constants (including 
$\emptyset$ for the empty set, and $\omega$ for the set of finite von Neumann ordinals) and 
parameters, with functional constants for the boolean operations of union, 
intersection and difference 

$s \cup t$, $s \cap t$, $s - t$ 

the power set, sum set, and pairing functionals 

$\mathcal{P}(t)$, $\bigcup(t)$, $\{s, t\}$ 

and the binding operators which produce comprehension terms 

$\{x \in t \mid A[x]\}$ (the set of all $x$ in $t$ such that $A[x]$) 

replacement terms 

$\{t[x] \mid x \in s\}$ (the set of all $t[x]$ such that $x$ is in $s$) 

and recursion terms 

$\mathrm{Rec}_x(t[x], r, s)$ (recursion with respect to $x$ in $t[x]$ along $r$ evaluated at $s$). 

It is technically convenient to keep individual parameters (free variables) 
and individual binding letters (binding variables) separate. This permits 
the unrestricted substitution of terms for terms and formulas for formulas 
without danger of clash of variables. An expression which is like a term 
or formula except for having occurrences of binding letters in place of 
certain occurrences of parameters is called a pseudo-term or pseudo-formula. 




Objects other than sets are allowed in the theory. When terms denoting 
such objects occur in contexts in which only terms denoting sets would, 
strictly speaking, be appropriate, they are treated, extensionally, as if they 
designated the empty set. For example 

$t \subseteq s$ 

is always true if $t$ designates a non-set; and in such a case 

$\mathcal{P}(t) = \{\emptyset\}$ 

so that the power set of any object can be interpreted as the set of all sets 
which are contained in that object. Of course, the law of extensionality 

$s \subseteq t \wedge t \subseteq s \;.\supset.\; s = t$ 

applies only to sets. An object $t$ is a set if, and only if, 

$t \subseteq \emptyset \;.\supset.\; t = \emptyset$ 

Either this can be taken as a definition, or a primitive predicate, Set, 
can be introduced and the appropriate axiom schema adopted. 

What is it that corresponds to conventional first order logic in this 
general system of classical logic? There is more than one possible answer 
to this question, but I shall give what is probably the simplest. A set $\Sigma$ of 
formulas is called a first order set if no recursion term occurs in $\Sigma$, nor 
any term involving the power set, sum set, or pairing functional constants, 
and if all the terms and pseudo-terms occurring in $\Sigma$ can be classified 
into set terms and element terms so that 

(i) Every pseudo-term occurring in $\Sigma$ is a set term or an element term, 
but not both. 

(ii) All binding letters are element terms. 

(iii) No element term occurs immediately to the right of a membership 
sign in formulas of $\Sigma$, and no set term occurs immediately to the 
left of such a sign. 

(iv) Only set terms occur on either side of an inclusion sign in formulas 
of $\Sigma$, and only element terms occur on either side of an equality 
sign in such formulas. 

(v) Only element terms occur as arguments in predicate or functional 
parameters of $\Sigma$. 

(vi) The terms giving the domain of variation in quantifiers, comprehension 
terms and replacement terms of $\Sigma$ must be set terms, 
as must be the argument terms for boolean functionals occurring 
in $\Sigma$. 

(vii) The pseudo-term $s[x]$ in a replacement term $\{s[x] \mid x \in t\}$ of $\Sigma$ 
must be an element term. 

A formula A is a first order formula if {A} is a first order set. 
The logic of first order sets of formulas can be developed along 
traditional lines. I am speaking here both of the proof theory and the 
model theory of the logic. In particular, cut-free methods of formal proof 
which are complete with respect to the obvious standard semantics are 
available. Thus nothing is lost of the standard theory of first order classical 
logic in passing from the usual formulation of that logic to the more 
general formulation sketched here. 

In formal terms the difference between the present approach to first 
order logic and the traditional one is this: Instead of dealing with an 
infinity of distinct formal languages, one deals with a fragment of the 
single formal language belonging to the unified general system of set 
theory and classical predicate logic. And instead of formalising the model 
theory of all of those languages in a formalised set theory whose own 
formal language is among those under consideration, one simply carries 
out one’s semantical arguments inside the general theory. Thus instead 
of defining a group to be ‘any model of the formal axioms $\Sigma$ in the 
language $\mathcal{L}$’ one would say, more directly, that ‘a group is a set G equipped 
with a binary operation such that...’. First order logic thus becomes the 
logic of certain kinds of arguments and definitions rather than the logic 
of certain kinds of formal languages, and the entirely artificial distinction 
between a formal axiomatic definition and a direct one disappears. 

However different these approaches may seem, it is nevertheless true 
that all of the theorems, definitions, techniques, etc. of the traditional 
formulation of first order logic carry over, mutatis mutandis, to the logic 
of first order sets of formulas as defined above. For example, that logic 
can be formalised in terms of natural deduction, or using tableaus, or 
Gentzen’s Sequenzen, etc. And all of the familiar theorems go through 
using obvious modifications of the traditional proofs. Nothing is lost. 
Only a minimum of difficulty is involved in switching from the one point of 
view to the other. Indeed, the difference between the two approaches is 
not very significant mathematically. But foundationally it is all important. 

The axioms and rules of inference for the general theory fall into 
seven main groups. 


I. Axioms of Elementary Logic. These axioms and rules are generalised 
versions of the usual first order axioms and rules. (They are not, 
however, stated only for first order formulas, but for arbitrary ones.)¹ 

II. Axioms of Sethood. These axioms establish the conventions governing 
the relations between sets and non-sets. 

III. Axioms of the Basic Functionals. These axioms characterise the power 
set, sum set, and pairing functionals. 

IV. Axiom of Foundation. This asserts the well-foundedness of the 
membership relation. 

V. Axiom of Choice. 

VI. Axiom of Transfinite Recursion. This characterises the recursion 
operator in the manner described in section 2.2. 

VII. Axiom of Infinity. This characterises the denotation of the constant 
$\omega$ as the set of all von Neumann natural numbers. 

Detailed formulations of all of these axioms and rules are to be found 
in Mayberry [1977a]. 

1 The axioms of replacement and comprehension, the axioms characterising the Boolean 
functionals (union, intersection and difference), the axioms of equality, and the 
fundamental axioms governing the use of the membership ($\in$) and inclusion ($\subseteq$) 
relations, are all included in this group. 

University of Bristol 


Discussions 


ROBINSON ON MY VIEWS: A CORRECTION 
In his recent discussion of split-brain man Daniel Robinson [1976] imputes 
to me the view that if Smith’s right cerebral hemisphere were transplanted into 
one human body and his left one into another, the result would be two Smiths, 
i.e. two humans who are both Smith (p. 76). One reason he gives for rejecting 
this claim is that in fact the two hemispheres are neither symmetrical in function 
nor identical in content (p. 77). 

But in the only reference Robinson makes to my work (Puccetti [1973b]) 
I did not discuss this at all. Where I did discuss it (Puccetti [1973a]) the view 
Robinson characterises as mine was in fact the one I attacked, a view put 
forward originally by David Wiggins [1967] and elaborated by Derek Parfit [1971]. 
Parfit used it to show how in certain circumstances we might be forced to give 
up the language of personal identity, since Smith R and Smith L could be in 
different rooms at the same time and the question, ‘Is Smith in this room or 
not?’ admits of no clear answer. 

My solution was to suggest that each cerebral hemisphere of Smith prior 
to the double transplant constituted the physical basis of a person; if so they 
always occupied the same room at the same time when callosally conjoined in a 
single body but need not after the operation. In this way one can save the 
transitivity of personal identity against the Wiggins—Parfit thought-experiment. 

Interestingly, one of the reasons I gave for ‘Smith’ having really been two 
people all along was just the asymmetry of function and content between the 
two cerebral hemispheres Robinson invokes against me. 

I think it worthwhile to set the record straight on these matters, for otherwise 
readers of the Journal are certain to be confused.¹ 

ROLAND PUCCETTI 
Dalhousie University 

1 In a letter dated 8 June 1976, Professor Robinson informs me that this mistake arose 
from unintentionally substituting my name for the words ‘the double-mind’ before 
the word ‘thesis’ in paragraph 4, p. 76 of his discussion. I accept this clarification 
gratefully, but wonder why, if he wished to attack a view I also opposed, he did not 
cite the paper in which I did so nor mention my disagreement with that view. 


1 INTRODUCTION 


The assertion that a particular time keeping device measures uniform time can 
only have empirical rather than conventional content if, in some way, it is possible 
to compare intervals at one time with intervals at another, to test whether or 
not they are equal. But since we cannot transport temporal intervals through 
time to make such a comparison, it is only by convention that we can call a 
particular time scale uniform, a view that has been advocated by Poincaré,¹ 
Reichenbach² and others. This view has been challenged by Ferrel Christensen³ 
who claims that ‘there is a procedure which can in principle be used to determine 
inductively whether or not a process is (non conventionally) strongly periodic’, 
that is, uniform. I will show that this claim is unfounded. 


2 THE CONVENTIONALITY OF TIME SCALES 


One measure of the passage of time is the rotation of the earth relative to the 
distant stars, but what meaning can be given to the statement that the earth is 
really slowing down or speeding up? If we define the unit of time by the earth’s 
rotation, then that rotation is tautologously constant, A/A = 1. 

However, if we choose to measure time by the oscillations of a quartz crystal, 
then we can meaningfully say that the earth is slowing down or speeding up 
relative to that crystal clock, but we can equally say that the crystal clock is 
speeding up or slowing down relative to the earth’s rotation. We cannot meaning- 
fully say of one measure that it is uniform and of the other that it is not; it is 
only the relation between the two measures that has empirical content, and such 
statements are purely numerical statements. Indeed, this is always the case: 
statements about the physical world that have empirical content are statements 
about relationships; they are independent of the scales used for measuring 
length, time or mass and are expressible in terms of pure numbers. For example, 
the Newtonian Law of Inertia, ‘a body continues in its state of rest or uniform 
motion in a straight line’, is without empirical content until we have chosen 
a scale of time and distance measurement. If distance is, for the sake of argument, 
measured by a ‘rigid rod’ and time by the earth’s rotation, this law takes the 
usual form $d^2x/dt^2 = 0$, but if we choose some other time, say $\tau = t_0 e^{t/t_0}$, to 
measure time, this law becomes 

$$\frac{d^2x}{d\tau^2} + \frac{1}{\tau}\,\frac{dx}{d\tau} = 0.$$

1 Poincaré [1928]. 2 Reichenbach [1958]. 3 Christensen [1976]. 

But the statement with empirical content is a statement that the test particle 
covers equal distances (measured by a number times our conventional choice 
of unit of length) in the temporal interval during which the earth completes 
one revolution relative to the distant stars. We can test the empirical truth of the 
statement by measuring in $t$ or $\tau$ time. It is the relationship between the particle’s 
motion and the earth’s rotation that has empirical content, not a formula such as 
$d^2x/dt^2 = 0$. 
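
The change of form under a relabelling of time can be checked with the chain rule. The sketch below assumes the alternative scale is $\tau = t_0 e^{t/t_0}$ with $t_0$ a positive constant; that reading of the scale is an assumption, and any scale of the form $\tau = C e^{t/t_0}$ gives the same result.

```latex
\begin{align*}
\tau &= t_0\, e^{t/t_0}
  \quad\Longrightarrow\quad \frac{d\tau}{dt} = e^{t/t_0} = \frac{\tau}{t_0},\\[4pt]
\frac{dx}{dt} &= \frac{d\tau}{dt}\,\frac{dx}{d\tau} = \frac{\tau}{t_0}\,\frac{dx}{d\tau},\\[4pt]
\frac{d^2x}{dt^2} &= \frac{\tau}{t_0}\,\frac{d}{d\tau}\!\left(\frac{\tau}{t_0}\,\frac{dx}{d\tau}\right)
  = \frac{\tau^2}{t_0^2}\left(\frac{d^2x}{d\tau^2} + \frac{1}{\tau}\,\frac{dx}{d\tau}\right).
\end{align*}
```

Since $\tau > 0$, the condition $d^2x/dt^2 = 0$ holds exactly when $d^2x/d\tau^2 + (1/\tau)\,dx/d\tau = 0$. Its solutions are $x = a + b\ln\tau$, which is the same motion as $x$ linear in $t$, described on the other scale.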


3 NATURAL AND REGULAR CLOCKS 


It is an empirical fact that there are many measures of the passage of time 
(oscillations of a quartz crystal, hydrogen maser clocks, caesium atomic clocks, 
the earth’s rotation, pendulum clocks...) that keep time with each other; that 
is, the ratio of any two measures of time is independent of time. It is therefore 
convenient to measure time by one such clock since the resulting description 
of the physical world may be simpler (measured by the complexity of the 
differential equations of Physics!) but this does not make such a clock ‘uniform’ 
in an empirical as opposed to conventional way. It is also an empirical fact that 
different clocks based on the same principle, e.g. two pendulum clocks, keep 
time with each other, no matter when we stop and start them. It is just such 
behaviour that makes physics a worthwhile pursuit; there exist processes in the 
Universe that are regular: given similar conditions, we obtain similar results. 
If this were not so, the search for physical principles or laws of nature would 
be futile. But the repetitive behaviour is independent of the conventional 
choice of the measure of time: if A/B is constant in time it does not matter in 
what units A and B are measured to establish this empirical result. 


4 SELF UNIFORM CLOCKS AND CONVENTIONALISM 


Ferrel Christensen¹ has appealed to this regularity of nature to claim that ‘there 
is a procedure which can in principle be used to determine inductively whether 
or not a process is (non conventionally) strongly periodic’, that is, uniform. 
Christensen’s argument is to compare two clocks which keep the same time, 
to stop one for a period then start it again, and compare the clocks. If they still 
keep time, the clocks are strongly periodic. Most ‘good clocks’ pass this test, i.e. 
the period of a pendulum is not affected by stopping and starting. Most ‘bad 
clocks’ fail this test, e.g. watches which ‘go slow’ as the mainspring nears the 
end of its unwinding. Now, while it may be an empirical fact that there exist 
‘self uniform’ clocks as judged by the stop-start test, this is simply an empirical 
fact about the world; it is true or false independently of the time scale conventionally 
chosen as uniform. If the periods $T_1$, $T_2$ of the clocks are the same after 
the stop-start test, it does not matter whether I measure them with my ‘faulty’ 
clock, since the empirical statement is $T_1/T_2 = 1$, which is independent of my 
conventionally defined time scale. Of course, for the practical physicist it is 
convenient to be able to build copies of his clock that do keep time after stopping 
and starting since he does not then have to refer back to his basic unit: 
convenient but not correct. He could by convention choose one such set of clocks to 


1 Christensen [1976]. 

be measuring what he chooses to call uniform time, but it is clearly a conventional 
choice. If all such ‘self-uniform’ clocks kept the same time it would 
probably be a very convenient choice, but they do not. 
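
The scale independence of the stop-start comparison can be illustrated numerically. The sketch below is only an illustration under stated assumptions: `faulty_scale` is a hypothetical monotonic relabelling of time standing in for the ‘faulty’ clock, and the tick times are invented for the example. Read on that scale, the period of each clock changes from tick to tick, yet the ratio of the two periods stays equal to one.

```python
import math


def faulty_scale(t):
    """Hypothetical non-uniform but monotonic relabelling of time (an assumption)."""
    return math.exp(0.3 * t)


def measured_periods(tick_times, scale):
    """Periods between successive ticks, as read on the given time scale."""
    readings = [scale(t) for t in tick_times]
    return [later - earlier for earlier, later in zip(readings, readings[1:])]


# Two clocks that tick in coincidence after the stop-start test
# (illustrative tick times on some reference scale).
clock_1 = [float(n) for n in range(10, 16)]
clock_2 = [float(n) for n in range(10, 16)]

periods_1 = measured_periods(clock_1, faulty_scale)
periods_2 = measured_periods(clock_2, faulty_scale)

# The individual periods depend on the scale; their ratio does not.
ratios = [p1 / p2 for p1, p2 in zip(periods_1, periods_2)]

if __name__ == "__main__":
    print("periods on the faulty scale:", [round(p, 3) for p in periods_1])
    print("ratios T1/T2:", ratios)
```

Any other strictly increasing function could be substituted for `faulty_scale` without changing the ratios, which is the sense in which the statement $T_1/T_2 = 1$ does not depend on the scale chosen.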


5 NON UNIQUENESS OF SELF UNIFORM CLOCKS 


Consider a light clock defined by successive reflections of a photon between 
two mirrors A and B; the round trip travel time defines the interval. Distance 
is determined by the clock (AB) and a constant light velocity of unity. The 
distance AB is then constant and equal to one. Let (CD) be a similar clock that 
keeps the same time as (AB). To fix ideas, let C be at rest relative to A, that is, 
its distance relative to A as determined by the clock (AB) and a light signal is 
constant; then D must also be at rest relative to A. To stop the clock (CD), we 
let the photon escape but keep the rest of the configuration the same, that is, 
C and D remain at rest relative to A as determined by (AB) and a light signal. 
On restarting (CD) (by inserting another photon) it will keep the same time as 
(AB). These clocks are self uniform. 

But now consider another clock (AE). As determined by the (AB) clock and 
a light signal, E has a constant velocity V, and therefore in (AB) time the distance AE is 
increasing and the interval between successive reflections at A is not constant 
in (AB) time. Indeed, as shown in the appendix, the relation between (AE) 
time and (AB) time is 

$$t_{AE} = \frac{\ln\left[1 + \dfrac{2V t_{AB}}{1 - V}\right]}{\ln(1+V) - \ln(1-V)}.$$

If now we construct another clock (AF) with identical properties to (AE), that 
is, F also has a velocity V relative to A as determined by the (AB) clock, and a 
distance of unity at time $t_{AB} = 0$, its time is 

$$t_{AF} = \frac{\ln\left[1 + \dfrac{2V t_{AB}}{1 - V}\right]}{\ln(1+V) - \ln(1-V)}$$

so that $t_{AF} = t_{AE}$. The clocks (AE) and (AF) keep time with each other. Moreover, 
as measured in (AE) time, E is at rest relative to A, always at a distance 
of unity on the (AE) scale; so, too, is F, since on the (AF) scale the round trip 
travel time AFA is also unity, being equal to AEA. The two clocks (AE) and (AF) 
have the same relationship between themselves as the (AB) and (CD) clocks. They, too, 
are self uniform. The same holds for any other pair of clocks with a different 
expansion velocity on the (AB) time scale. 

Hence there exist infinitely many families of possible clocks; each family is self 
uniform, but each family keeps a different time. We cannot say that one is more 
uniform than another; we have to accept the multiplicity of self uniform 
clocks and set one to be uniform by convention. 








6 PHYSICAL TIME SCALES 


While we have demonstrated that there are many ‘self uniform’ clocks, it may 
be argued that all natural clocks keep the same time. By natural, we mean 
clocks based on some physical principle. For instance, we can measure time 
by electromagnetic clocks, of which we can imagine at least three versions 
defined by different atomic transitions: (i) transitions between different principal 
levels, (ii) transitions between fine structure levels, (iii) transitions between 
hyperfine levels. Another set of clocks would be nuclear decay clocks, defined 
so that a radioactive nucleus decays exponentially in time. But there are other 
clocks: the rotation of the earth, time measured by a test particle far removed 
from other matter covering equal distances in equal times, the rotation of the 
earth round the sun, and a cosmological clock defined such that the density of 
the Universe is constant, or such that the round trip travel time between 
galaxies is constant. In some sense, the cosmological clock is a measure of 
‘absolute time’ since it uses the gross properties of the Universe rather than 
local physical phenomena. 

Not all these clocks keep time one with another. The observed redshift 
magnitude relation in cosmology demonstrates that cosmological time $t_c$ is 
different from principal atomic time $t_a$. In the Einstein-de Sitter Cosmology 
of General Relativity 


tact}. 
Nor does a gravitational clock necessarily keep the same time as atomic time, 
or cosmological time. For example, in the Jordan-Dicke¹ scalar tensor theory 
of gravity, the orbit gravitational time is 
toc t6 +80) 
whereas the cosmological time is 
toc H2 tO(4+ 30) 


where $\omega$ is the coupling constant of the scalar field ($\omega \geq 10$). These time scales 
are all different. Empirically, we have no evidence for such differences but then 
experiments have not yet achieved the required accuracy. 

At the present time there is no evidence to suggest that the nuclear and 
different electromagnetic clocks keep different times, but again the required 
experimental accuracy has not been achieved. In principle, it is possible to 
conceive of non-metric theories of gravity in which these differences would be 
detectable by measuring the second order gravitational red shift of different 
clocks.² 

While most possible differences between different physical time scales are 
still conjectural, the difference between atomic and cosmological time is 
demonstrated by the expansion of the Universe, a difference emphasised 
particularly by E. A. Milne.³ Over normal human time scales, this difference is 
insignificant, but it becomes large on a cosmological time scale. At the present 
level of empirical data, we do not know whether other physical time scales agree 
with $t_a$ or $t_c$ or differ from both. For instance, we have no empirical foundation 
for taking the proper time $ds$ in general relativity to be measured by atomic 
clocks rather than cosmological clocks. 

Since we know of the existence of at least two distinct physical time scales, 
we can only call one of them uniform by convention. But it matters not which 
we choose, it is only the relationship between phenomena that has empirical 


1 See Dicke [1966]. 2 Will [1974]. 3 Milne [1948]. 

content; in Eddington’s¹ words ‘the theory of the expanding Universe might 
also be called the theory of the shrinking atom’, and this relationship can be 
obtained empirically no matter what time scale is used to measure both 
phenomena. Scales of measurement are convenient intermediaries, but the 
empirical results about the world are scale independent; they are numerical 
statements about the relationships between systems, and it matters not what 
measure of the passage of time we conventionally define as uniform. 


APPENDIX RELATIONSHIP BETWEEN TIME SCALES OF LIGHT CLOCKS 


Consider the two light clocks (AB) and (AE); as measured by the (AB) clock, E has 
a velocity V. We shall assume the validity of special relativity. Figure 1 is a 
space-time diagram of the situation; the dashed line is the path of the photon 
in the (AE) clock. 








The photon is emitted at $R_0$, reflected at $R_1$, then $R_2$, ...; E is coincident with 
B at $R_1$ when $t_{AE} = t_{AB} = 0$ (by choice). Since the travel time $R_0 R_1 R_2$ is 
unity on the $t_{AB}$ scale (by definition of the unit) we assign times $-\tfrac{1}{2}$ to $R_0$, $+\tfrac{1}{2}$ to $R_2$. 
If $k$ is the ratio of the interval between the emission of two signals by A to 
the interval between their reception by E, and vice versa, the intervals between 
successive reflections at A are as shown: $1, k^2, k^4, \ldots$ 

After $n$ round trip reflections between A and E ($n$ ticks of the (AE) clock) the 
number of ticks on the (AB) clock is 

$$t_{AB} = 1 + k^2 + k^4 + \cdots + k^{2(n-1)} = \frac{1 - k^{2n}}{1 - k^2}$$


1 Eddington [1932], p. 90. 

or, since $n = t_{AE}$, 

$$t_{AE} = \frac{\ln\left[(k^2 - 1)\,t_{AB} + 1\right]}{\ln k^2}.$$

The factor $k$ is the Doppler interval factor $[(1+V)/(1-V)]^{1/2}$ and so 

$$t_{AE} = \frac{\ln\left[1 + \dfrac{2V t_{AB}}{1 - V}\right]}{\ln(1+V) - \ln(1-V)}.$$
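
The closed form can be checked against the tick-by-tick sum. The following is a minimal sketch, assuming units with the velocity of light equal to one, $0 < V < 1$, and (AB) time counted from the first emission of the photon, as in the geometric series; the function names are illustrative and not taken from the text.

```python
import math


def doppler_k(v):
    """Doppler interval factor k = sqrt((1 + v) / (1 - v)), for 0 < v < 1."""
    return math.sqrt((1.0 + v) / (1.0 - v))


def ab_time_after_ticks(n, v):
    """(AB) time elapsed after n ticks of (AE): 1 + k^2 + ... + k^(2(n-1))."""
    k_squared = doppler_k(v) ** 2
    return sum(k_squared ** j for j in range(n))


def ae_time(t_ab, v):
    """Closed form: t_AE = ln[1 + 2 V t_AB / (1 - V)] / [ln(1 + V) - ln(1 - V)]."""
    numerator = math.log(1.0 + 2.0 * v * t_ab / (1.0 - v))
    denominator = math.log(1.0 + v) - math.log(1.0 - v)
    return numerator / denominator


if __name__ == "__main__":
    v = 0.2  # illustrative recession velocity of E
    for n in range(1, 6):
        t_ab = ab_time_after_ticks(n, v)
        # The closed form should recover the tick count n of the (AE) clock.
        print(n, round(t_ab, 6), round(ae_time(t_ab, v), 6))
```

Equal steps of (AE) time correspond to geometrically growing steps of (AB) time, which is why the two clocks cannot both be read as uniform on a single scale.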
POPPER VERSUS PEIRCE ON THE PROBABILITY OF SINGLE CASES 


Recently R. W. Miller has argued in this journal that Peirce’s later interpretation 
of probability as a ‘would-be’ is to be preferred to Popper’s interpretation of 
probability as a propensity since, despite their similarities there are differences 
which show the superiority of Peirce’s view (Miller [1975]). This conclusion 
took me by surprise because my own comparison of the two views had led me 
to a contrary conclusion (Settle [1974] and [1975]). Perhaps there was something 
in Peirce’s view I had missed; perhaps I had misjudged the significance of the 
differences. A study of Miller’s article, however, convinced me that it was 
Miller and not I who was mistaken and that his error turned on his seemingly 
not comprehending the significance of the problem of the probability of singular 
events, and thus on his not understanding Popper’s solution of it (Popper 
[1957], [1959] and [1967]). (See Popper’s comment on my [1974] discussion, 
Popper [1974], pp. 1117-21). 

I endorse Miller’s view that a discussion of outmoded views may shed light 
on contemporary problems. Especially, I think the contrast between Peirce’s 
and Popper’s views is illuminating in a way Miller failed to notice. Peirce saw 
clearly the clash between his interpretation of probability and the common- 
sense view that it was not meaningless to assert probabilities of singular and 
even unrepeatable events. The existence of the clash must have been particularly 
frustrating to Peirce whose introduction of the idea that probabilities were rooted 
in the physical properties of the chance set-up made such a signal break both
with the frequency interpretation and with the more popular subjectivist
interpretation. But as long as Peirce understood a ‘would-be’ as a virtual habit,
he was unable to provide an analysis of the probability of unrepeatable events.
This point is worth some discussion, aside from the question of settling whose
view to prefer, Peirce’s or Popper’s, because it raises the rather fundamental
question of the difference between the real world and its properties on the one
hand, and the scientific model of the real world, on the other. Before taking up
the weakness of Miller’s discussion of Popper’s interpretation I should like 
to remark upon this metaphysical distinction. (For further discussion see my 
[1973] and [1975].) 

In raising the question of the distinction between reality and the scientific 
model of reality, I already have taken sides on several important philosophical 
disputes. There is not space here to have them all out, but let me mention three. 
Against the phenomenologists, I assume the existence of a real physical world
which causes perceptions without being accurately presented in perception.
Against the positivists, I assume that language imposes an order on events
which they may not possess merely by virtue of the use of general nouns, and
of verbs that name types of actions. Against the inductivists, I assume that
theories in science are conjectural and corrigible, intended to model reality
more or less well. Given these assumptions, which I think are shared by Popper,
it is reasonable to distinguish between a real situation which may have a
propensity p(a) to realise event a, on the one hand, and a model of the situation
under a certain description which may have a propensity p’(a’) to realise event a’ 
which models a. Popper has discussed the problem of giving varying descriptions 
of a single situation and deriving different propensities depending upon the 
descriptions (Popper [1967], pp. 33-9). 

I do not recall a discussion of this problem in Peirce’s writings on probability 
as a ‘would-be’. Rather I got the impression that Peirce identified the real 
situation with the situation as modelled in his description of it. In this case, 
Peirce could hardly be expected to make either of the two moves at which 
Popper hints which might resolve the impasse caused by understanding 
probability to be a propensity to generate (virtual) sequences. Of course, as 
Popper says, his interpretation ‘may be presented as retaining the view that 
probabilities are conjectured... frequencies in long (actual or virtual) 
sequences,’ ([1959], p. 37) but this would be to risk confusing a concept with 
how it may be measured—an old operationalist trick. Let me take up the 
first of Popper’s hints in the context of Peirce’s announcing himself defeated. 

Peirce’s problem was this. ‘If a man had to choose between drawing a card 
from a pack containing twenty-five red cards and a black one, or from a pack 
containing twenty-five black cards and a red one, and if the drawing of a red 
card were destined to transport him to eternal felicity, and that of a black one to 
consign him to everlasting woe, it would be folly to deny that he ought to 
prefer the pack containing the larger proportion of red cards, although, from 
the nature of the risk, it could not be repeated.’ (Peirce, cited in Miller [1975], 
p. 128). How is one to reconcile the advice to choose from the pack with the 
most red cards, with the view that probabilities of single cases make no sense? 
Miller’s attempt to rescue Peirce’s good name as ‘the father of Pragmatism’ 
(p. 128) by suggesting our gambler provisionally accepts the 25/26 chance of
drawing a red card from the first deck quite misses the point. What Miller
suggests our gambler accepts is strictly, from Peirce’s point of view, vacuous.
The 25/26 chance of drawing a red cannot, on Peirce’s view, refer to the next
draw but only to a sequence of draws. On Peirce’s view, common sense errs
in inferring from the probability of a kind in a sequence that an instance of that
kind has a probability. One could not even infer from a probability of an instance
to the probability of a kind in a sequence, unless one adopted a view like Popper’s
that single case propensities make sense. But this is just the view Miller rejects, 
unless I misread him. 

The first of the two moves Popper makes, which leave room for the view 
that single case propensities make sense, is to emphasise that probability is a real 
physical propensity of the generating conditions rather than a quasi-logical 
property of the result. Common talk about the probability of drawing a red card
or about heads being thrown with a fair coin errs in omitting reference to the
environment and to certain random features in the conditions which generate the
outcomes. The probability in question is the propensity of a deck to yield a red card
when randomly drawn from, or the propensity of a combination of a coin, spun
randomly, with a gravitational field, and a flat elastic surface below the coin,
to produce the event of the coin’s lying head uppermost on the flat surface.
Peirce never explicitly held this view, although in his [1910] note he hinted at
a connection between the probability of a certain kind of throw and the
composition of a dice. Let me stress: the probability of a red’s being drawn is a
propensity of the deck in the context of random or blind selection. It has this 
propensity (whatever its value) partly by virtue of its composition: the pro- 
pensity would vary if the composition varied. (The propensity to yield a red 
would also vary if the manner of selection were varied.) 

Popper’s second move is to regard propensity as a weighted possibility. If 
more than one outcome is possible, each possible outcome will have a weight 
less than unity. In some cases, a theoretical modelling of the chance situation 
may enable us to calculate conjectured weights of the possibilities of some 
outcomes of some situations and thus to predict the chance of a single outcome 
or the frequency of occurrences of a certain kind in a long run, though of 
course, predictions of chance can be tested only statistically. Thus, Popper’s 
theory, but not Peirce’s, may give a person a valid reason to back chance in 
a single case, if he knows a theory from which to infer a measure of the chance. 
More importantly, Popper’s theory, but not Peirce’s, makes sense of talk about 
the probability of a single event’s occurring, even in the absence of theories that 
might enable us to calculate a conjectured probability. 
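
The contrast between a conjectured single-case weight and its statistical test can be illustrated with Peirce’s pack of twenty-five red cards and one black. The Python sketch below is only a toy model: it assumes blind selection is adequately represented by a uniform pseudo-random draw with replacement, and the function names are hypothetical. The first function computes the weight attached to a single draw from the composition of the pack; the second estimates the relative frequency of red in a long run, which is the only way such a conjecture could be tested.

```python
import random
from fractions import Fraction


def red_weight(red, black):
    """Conjectured weight of drawing a red card once, from the pack's composition."""
    return Fraction(red, red + black)


def long_run_frequency(red, black, draws, seed=0):
    """Relative frequency of red over many uniform draws with replacement (toy model)."""
    rng = random.Random(seed)  # fixed seed so that the run is repeatable
    deck = ["red"] * red + ["black"] * black
    hits = sum(1 for _ in range(draws) if rng.choice(deck) == "red")
    return hits / draws
```

Here `red_weight(25, 1)` gives the 25/26 figure directly, while `long_run_frequency(25, 1, 20000)` should come out close to it without being guaranteed to equal it. Changing the composition of the pack, or replacing the uniform draw by some other manner of selection, changes the weight, in line with the remark that the propensity belongs to the pack together with the conditions of selection.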

This is a case of philosophy redeeming rather than, as is more usual, debunking 
common sense. 

TOM SETTLE 
University of Guelph 


REFERENCES 


MILLER, R. W. [1975]: ‘Propensity: Popper or Peirce?’ British Journal for the Philosophy
of Science, 26, pp. 123-32.
PEIRCE, C. S. [1910]: ‘Notes on the Doctrine of Chances’, in C. Hartshorne and P. Weiss
(eds.): Collected Papers of Charles Sanders Peirce, vol. 2 (1931), pp. 405-14.
POPPER, K. R. [1957]: ‘The Propensity Theory of the Calculus of Probability and the
Quantum Theory’, in S. Korner (ed.): Observation and Interpretation.
POPPER, K. R. [1959]: ‘The Propensity Interpretation of Probability’, British Journal
for the Philosophy of Science, 10, pp. 25-42.
POPPER, K. R. [1967]: ‘Quantum Mechanics without “The Observer”’, in Mario Bunge
(ed.): Quantum Theory and Reality. Springer-Verlag, pp. 7-44.
POPPER, Sir K. R. [1974]: ‘Replies to my Critics’, in P. A. Schilpp (ed.): The Philosophy
of Karl R. Popper. Open Court, pp. 961-1197.
SETTLE, T. W. [1973]: ‘Are some Propensities Probabilities?’ in R. J. Bogdan and I.
Niiniluoto (eds.): Logic, Language and Probability. Reidel, pp. 115-20.
SETTLE, T. W. [1974]: ‘Induction and Probability Unfused’, in P. A. Schilpp (ed.):
The Philosophy of Karl R. Popper. Open Court, pp. 697-749.
SETTLE, T. W. [1975]: ‘Presuppositions of Propensity Theories of Probability’, in
G. Maxwell and R. M. Anderson, Jr. (eds.): Induction, Probability, and Confirmation.
University of Minnesota, pp. 388-415.


Brit. J. Phil. Sci. 28 (1977), 181-194 Printed in Great Britain 181 


Review Articles


A LOGICAL EMPIRICIST LOOKS AT BIOLOGY*


1 RUSE’S STRATEGY


The strategy which Michael Ruse adopts in his The Philosophy of Biology is to
resolve certain disputes over the nature of biology using the conceptual tools 
provided by the logical empiricist analysis of science. His exposition centres 
around two topics: the relation between Mendelian genetics, population genetics, 
the synthetic theory of evolution and molecular biology (six chapters) and the 
theoretical commitment which any system of classification must have if it is to
be scientifically significant (two chapters). In addition, he devotes a chapter to
the theoretical character of genes in contrast to phenotypic traits and another to 
the explication of goal-directed systems in terms of adaptation. The tools of the 
logical empiricist analysis of science which Ruse utilises are the covering-law 
model of scientific explanation, scientific theories as hypothetico-deductive 
systems, reduction as theory derivation, the analytic-synthetic distinction, 
laws as true, universal, non-analytic statements which are in some sense nomically 
necessary, and the multidimensional distinction between theoretical/hypothetical/ 
non-observable entities and observable/really existent/non-theoretical entities. 

Ruse is as aware as the next person that the logical empiricist analysis of 
science—construed broadly to include the work of philosophers from Bridgman, 
Braithwaite and Brodbeck to Hempel, Nagel and Smart—has undergone a 
barrage of criticism during the past two decades, not just from intransigent out- 
siders but from its own proponents and their students as well. If the ability to 
sustain self-criticism is a sign of intellectual viability, logical empiricism is alive 
indeed. But Ruse believes that many of the objections which have been raised are 
misdirected, that in most respects the logical empiricist analysis is good enough 
to handle the sorts of issues which have arisen in the philosophy of biology, and 
that at the very least it is no less applicable to biology than to physics. Ruse seems 
to have concluded that now is not the time to use the apparent discrepancies 
between biology and the logical empiricist analysis of science to cast doubt on 
the most highly articulated view of the nature of science currently extant. Rather 
at this stage philosophers might spend their time more profitably by using the 
conceptual tools of logical empiricism to shed as much light as possible on the 
nature of biology. 

There is much to be said for the strategy which Ruse has adopted, but I am
hardly the one to evaluate it dispassionately since I too have written a book on
the philosophy of biology taking just the opposite tack (Hull [1974a]). The issue
which divides us is whether or not a precise and rigorous version of the logical
empiricist analysis of science can be applied to biology without undue distortion.
In my work I have maintained a rather stiff-necked attitude toward both logical
empiricist philosophy of science and biology. The most attractive features of
logical empiricism have been its precision and rigour and its concern with science.
As I see it, the science with which philosophers of science must deal is real
science, not some simplistic textbook parody or self-serving rational
reconstruction, and the precision and rigour of the tools developed by the logical
empiricists can be watered down only at considerable cost.

* Review of Michael Ruse [1973]: The Philosophy of Biology. London: Hutchinson
University Press, Hardback £3, Paperback £1. Pp. 231.

Ruse shares my attitude toward the ‘science’ in the philosophy of science. The
biological theories which he discusses are genuine biological theories, not 
caricatures. But whenever a discrepancy between the logical empiricist analysis 
of science and biology begins to emerge, Ruse tends to substitute a reasonable, 
less formal notion for the traditional logical empiricist view. For example, no
biological theory has yet been axiomatised completely in a strict sense of that
term, but in a loose sense biological theories have been axiomatised, sort of, or at
least that is the ideal toward which biologists are striving. The same can be said
for the covering-law model of scientific explanation and the kinds of explanations
one finds in biology, the Nagelian model of theory reduction and the relation 
between biological theories, etc. However, the more reasonable Ruse gets in his 
interpretation of the logical empiricist analysis, the more he sacrifices the 
precision and rigour of that analysis. 

In this review I propose to discuss the chief issues in the philosophy of biology 
raised by Ruse both for their own sake and as examples of the relative strengths 
and weaknesses of the logical empiricist analysis of science. My criticisms tend 
to be directed more at the conceptual tools of logical empiricism than at Ruse’s 
particular use of them. To the contrary, Ruse presents a more convincing case 
for the general applicability of these tools than I thought possible. What is more, 
the criticisms which I do make result from the standards established by the 
logical empiricists themselves and arise out of that tradition. And finally, I must 
add that I agree with Ruse’s major conclusion. With the possible exception of 
the presence of selection theories in biology (and psychology) and their apparent 
absence in physics, I agree that the logical empiricist analysis of science is as 
applicable to biology as to physics—viz., it provides some initial clarification but 
in the long run is not good enough. In addition, I do not think that further 
tinkering with the analysis will do any good. It needs a complete overhaul. 


2 THEORETICAL VS OBSERVABLE ENTITIES


Early in the history of genetics, the legitimacy of postulating such hypothetical 
entities as genes to explain the transmission and distribution of observable traits 
was a live issue, and the support which logical empiricists have given to the 
crucial role of theories, theoretical terms and theoretical entities in science 
would have been helpful. As Ruse argues, genes are paradigm cases of theoretical 
entities. Initially genes were extremely hypothetical and far from observable. Also 
the diverse gene concepts which have been proposed through the years have been
closely connected to particular theories of heredity. Genes are now a good deal
less hypothetical than they were and as observable as anything that small is liable
to get. However, the gene concept remains as theoretically committed as when
it was first proposed. The situation appears to be quite different with respect to
phenotypic traits which range from such observable structures, properties, and
processes as eye colour, the relative position of valves in the makeup of a
crustacean, or the shape of a mouth to such difficult to observe entities as proteins. 
In fact, from the point of view of molecular biology, proteins are the ultimate 
phenotypic traits. The problem, of course, is the degree of theory-ladenness of 
the more ‘observable’ phenotypic traits: 


The gene itself may not be hypothetical, but it was, and many aspects of it 
still are; conversely, generally speaking phenotypes are not and never were 
hypothetical, although some were, some still are, and the same holds of 
some phenotypic characteristics. Finally, the gene does seem to be strongly 
linked with theory, much more so than phenotypes; but perhaps even here, 
because of some phenotypic theory-ladenness, the full story is rather more
complex (p. 24). 


The full story depends upon how loosely one construes the notion of a scientific 
theory and how sensitive one is to theory-ladenness. At one extreme are
morphological, embryological, and cytological ‘theories’. If they count as genuine
scientific theories, then all the terms used to describe organisms are as theory- 
laden as ‘gene’. A mouth may be a mouth to the man on the street, but not every 
opening at the anterior of an organism counts as a mouth to a biologist, nor can 
the anterior of an organism be determined just by looking and seeing. Not all 
fruits are fruits either, nor all animals animals. Some of these terms may have 
originated in ordinary discourse, but they are now technical terms in biology, as 
technical as ‘work’ in mechanics. But even if one wishes to deny the status of 
theories to the sorts of formulations one finds in morphology, embryology and 
cytology, certainly evolutionary theory counts as a genuine scientific theory, even 
if it is not, as Popper once remarked, a very good theory. However, evolutionary 
theory entails the notion of evolutionary homology. According to this notion, not
everything which pre-evolutionary biologists were inclined to term the ‘same’
trait is actually evolutionarily homologous. Thus, if two instances of the same
trait can count as genuine instances of the same trait only if they are evolutionarily
homologous, then even the most pedestrian phenotypic trait is heavily theory- 
laden. 

Elsewhere Ruse [1970a] has argued that in spite of theory-ladenness, two
different scientific theories can often be brought to bear on the same phenomena; 
for example, in the nineteenth century fossils presented problems for both the 
special creationists and the evolutionists, and both groups meant the same thing 
extensionally and intensionally by ‘fossils’. Fossils were the petrified remains of 
organisms which had once lived. The origin of these organisms was not part of 
the meaning of the term. Similarly, both Kelvin and Darwin were talking about 
the same phenomenon when they disagreed about the age of the earth. Perhaps 
some terms connected with a scientific theory are theory-laden, but according to 
Ruse, competing theories share a sufficient number of terms to permit the
comparison of one theory with another. I agree, but Ruse presupposes a view of
scientific theories at variance with the logical empiricist interpretation of theories
as highly formalised axiomatic systems. Actual scientific theories are presented
so informally that it is often difficult to decide which phenomena a single 
scientific theory entails and which not, let alone attempting to compare two 
different scientific theories in this respect. But logical empiricists are rarely 
impressed with facts about actual scientific theories and practices. The ‘theories’ 
which they are talking about are their own rational reconstructions. Perhaps
scientific theories as actually presented by scientists can be made to conflict with 
each other in a crude, informal manner, but the story is quite different for 
perfectly explicit, completely axiomatised rational reconstructions. 


3 OPERATIONISM IN BIOLOGICAL CLASSIFICATION 


A second controversy in biology for which the conceptual tools of the logical
empiricist analysis of science are at least initially helpful is the recent dispute
between ‘evolutionary’ taxonomists and their ‘pheneticist’ critics. The dispute
centres around the role of theory, particularly evolutionary theory, in the con- 
struction of biological classifications and the need for science to be ‘operational’. 
During the course of the dispute, three overlapping but conceptually distinct 
positions have materialised concerning the goals of evolutionary taxonomy. 
Although Ernst Mayr and G. G. Simpson have tended to present a united front
to their opponents, Ruse distinguishes quite perceptively between Mayr’s 
genetical view and Simpson’s more genealogical position. Mayr emphasises the 
evolutionary process itself, Simpson the resulting phylogeny. The difference 
between the two approaches can be seen in Mayr’s emphasis on gene exchange 
and reproductive isolation in his species concept and Simpson’s explicit concern 
with the ancestor-descendant relation. Ruse also mentions but does not discuss 
a third school of evolutionary taxonomy which was just emerging when he was 
writing his book—the cladistic school of Willi Hennig. Like Simpson, Hennig 
proposes that classifications have a systematic relation to phylogeny, but the 
relation visualised by Hennig is a good deal more precise and invariant than the 
more flexible relation suggested by Simpson. 

As different as these three schools are in detail, they share a common commit- 
ment to evolutionary theory, a commitment not shared by their phenetic 
opponents. In their earliest writings, the pheneticists seemed to be proposing 
that taxonomists construct a single, general purpose classification by clustering 
organisms into taxa on the basis of equally-weighted traits according to their 
‘overall similarity’. Such a phenetic classification, they claimed, was prior to all 
other classifications constructed for special purposes and could be used by all 
scientists regardless of their theoretical interests. The current view of the 
pheneticists seems to be that the notion of a single measure of overall similarity 
was a metaphysical delusion, that the weighting of traits is perfectly acceptable
as long as it is explicit, objective and expressed quantitatively, and that all 
classifications are equally special purpose. The major advantage which phenetic 
classifications have over their evolutionary competitors is that they are more 
objective, quantitative, operational, etc. 

The objections which Ruse raises to the claims of the pheneticists are well
taken. By now the shortcomings of extreme empiricism and operationalism are
commonplace among philosophers. In the two chapters which he devotes to
taxonomy, Ruse discusses just the right issues and suggests reasonable
compromises between the two sides. However, like most philosophers, and I include
myself in this number, Ruse is not very sensitive to the practical aspects of 
science. If palaeontologists can, in principle, reconstruct on occasion highly 
warranted phylogenies, then we as philosophers are content and feel justified in 
dismissing the pragmatic objections which have been raised to such
reconstructions. The scientist critics argue, however, that too few phylogenies are sufficiently
warranted to justify attempting to classify all organisms phylogenetically. 
Perhaps there is nothing wrong in principle with the various classificatory 
programmes set out by the evolutionists, but in practice they will rarely be 
actually applicable. Heuristics also plays a significant role in science, a role which 
the principles of logical empiricism tend to obscure.


4 SMART ON LAWS IN BIOLOGY 


If there is a bête noire in Ruse’s book, it is J. J. C. Smart (see also Ruse [1970b]).
In a pair of books ([1963], [1968]) Smart presents a battery of arguments
purporting to show that biology contains nothing in the way of genuine 
scientific laws or theories, or at least that none of the genuine laws in biology are 
irreducible to those of physics and chemistry. Because Smart is not very careful 
in keeping his weaker and stronger conclusions separate, it is often difficult to
criticise his various arguments. Too often Smart feels obligated to dismiss a 
biological generalisation as not being a genuine law of nature when all he is 
really obligated to show is that it is reducible to the laws of physics and chemistry. 
His exposition is further complicated by the fact that he is not especially interested 
in biology for its own sake but only as an impediment to his materialistic solution 
to the mind-body problem. He is primarily interested in arguing that there are 
no genuine irreducible laws in psychology, only incidentally that the same story 
holds for biology. In general, Smart argues that those statements which biologists 
try to pass off as laws are either singular statements concerning spatiotemporally 
localised individuals, parts of historical narratives, tautologies, propositions akin 
to those in engineering, or derivable in principle from the laws of physics and 
chemistry. 

Ruse deals with each of Smart’s objections, sometimes by showing that the 
biology is mistaken, sometimes by liberalising the notion of scientific law, but 
usually by admitting that Smart’s claims about biology are true, adding that the 
same situation exists in physics. Ruse does not, however, object to Smart’s 
weaker claim. He agrees that in principle the laws of biology are derivable from 
those of physics and chemistry (see later discussion). Of all the areas of disagree- 
ment between Ruse and Smart, I will deal with only one here—the possibility of 
the names of biological species appearing in genuine laws of nature. 

Smart believes that the claim that ‘albinotic mice always breed true’ exemplifies 
an important class of biological generalisations which purport to be scientific 
laws. He claims, to the contrary, that either they are singular statements or we 
have no reason to suppose them true ([1963], pp. 53-4). If, on the one hand, the 
names of biological species are defined by their place in the evolutionary tree, 
then biological species are spatiotemporally localised individuals and any
statements which make reference to them are singular. If, on the other hand, the
names of biological species are defined by sets of non-genealogical traits, then
biological species are genuine classes and statements which refer to them might
count as scientific laws, if we had any reason to suppose that they are true,
which we do not. As long as species are individuated on the basis of descent, the
evolutionary process guarantees that the traits which characterise individual
organisms belonging to the same species should be correlated in some way or 
other. If descent is ignored and species defined in terms of particular sets of 
traits, there is no reason to suppose that any other trait or set of traits will covary 
with the set of defining traits. 

Ruse responds to Smart’s contention in several ways. First, he observes that 
there is often a significant overlap in the extensions of species defined in terms 
of descent and species defined in terms of the statistical covariation of non- 
genealogical traits. In fact, the traits covary the way they do because of descent. 
Elsewhere ([1970], [1974b]) I have argued that this particular argument is not
very convincing. In cases of polymorphic species, the sets of traits which
characterise the morphs can be quite disjoint. Ruse further counters Smart’s 
objections by distinguishing ‘fundamental’ and ‘derivative’ laws. Fundamental 
laws, like Newton’s laws, are spatiotemporally unrestricted, while derivative 
laws ‘make reference to a place or time and are derived from fundamental laws 
together with other assumptions (as are Kepler’s laws)’. But unless some
requirement of generality is retained, every singular statement derivable from a law of
nature itself becomes a law of nature. I agree with Smart that all statements 
which contain uneliminable reference to genealogically defined species are 
singular, and that singular statements, as important as they may be in other 
respects, cannot count as laws of nature. 

But regardless of what one thinks of Ruse’s first two rejoinders to Smart, his 
third is sufficient to nullify any force which Smart’s argument might have had. 
Whether or not one views species as spatiotemporally localised individuals, 
whether or not one thinks that laws of nature can make uneliminable reference 
to such individuals, the status of the biological laws at issue is unaffected because 
none of them make reference to particular species anyway. Smart thinks that the 
statement ‘albinotic mice always breed true’ is characteristic of a wide spectrum 
of so-called biological laws, but none of the currently accepted biological theories 
refer to mice or any other biological species. Mendelian genetics refers to 
recessive genes, epistatic genes, crossover, etc., not to ‘mice genes’. Population 
genetics and various versions of evolutionary theory refer to such things as 
primitive species, geographically isolated populations, heterosis, and genetic 
equilibrium, not to Mus musculus (Hull [1974a], [1975], [1976]; Munson [1976]).

Smart raises other objections to any of the statements found in biology being 
genuine laws of nature—like ‘the survival of the fittest’, they are tautologies, or 
like ‘mammals arose from reptiles’, they are parts of purely descriptive historical 
narratives, or they belong with the statements of engineering. Thomas Goudge 
[1961] has raised similar points but has reached the opposite conclusion. For
example, he gladly admits that many statements in biology are parts of historical 
narratives or concern the organisation of particular organisms (the way that 
statements in engineering concern the organisation of radios), but then he defends 
biology by attributing explanatory force to such statements. Ruse disagrees with
both Smart and Goudge. He has no patience with Goudge’s narrative and 
integrating ‘explanations’. A proffered explanation is a genuine explanation to 
the extent that it approaches the covering-law ideal. But in opposition to Smart,
he argues that even after all the tautologies, purely descriptive historical nar- 
ratives, and statements concerning the structure of organisms are deleted from 
biology, a significant number of biological laws remain. 


5 REDUCTIONISM IN BIOLOGY


The main thrust of Ruse’s book is that Mendelian genetics, population genetics 
and the synthetic theory of evolution are all ideally hypothetico-deductive in 
form, that population genetics is derivable from Mendelian genetics, and that 
population genetics is the core of the synthetic theory of evolution. Ruse
completes his reductionist programme by arguing, in his final chapter, that Mendelian
genetics is reducible to molecular biology. Thus, it would seem to follow that, 
ideally, molecular biology forms the core of the synthetic theory of evolution. 
On closer scrutiny, however, Ruse’s reductionism is not as robust as it may at 
first appear. For example, his claim that ‘not only can genetics be axiomatised,
but that it is in fact axiomatised (in a perfectly unobjectionable manner)’ is
weakened considerably when one realises the casual notion Ruse has of
axiomatisation. ‘To say that something is axiomatised is to say that we start with
some statements as premises (in science, these statements are laws together with
logical and mathematical truths), and from these we derive other statements’
(p. 33). But as Ruse is aware, the logical empiricist position on reduction was 
designed with a much more rigorous sense of axiomatisation in mind. In fact, 
Ruse has ‘no time’ for the Woodgerian ‘axiomatise before-all-else’ school and 
dismisses all previous attempts to axiomatise biology in any strict sense of the 
term. For example, he remarks that Williams ([1970]) ‘succeeds in her axiomatisa- 
tion of evolutionary theory only by avoiding all mention of genetics!’ (p. 50). 
Elsewhere he ([1975]) has noted that Woodger ([1952]) succeeds in his partial 
axiomatisation of Mendelian genetics only by avoiding all mention of genes! 

Ruse does not present anything in the way of a rigorous axiomatisation in his 
book. Instead he sketches the general outlines of a few representative theories 
and shows how one of the basic principles of population genetics (the Hardy- 
Weinberg law) can be derived from Mendel’s first law in conjunction with a 
variety of other biological assumptions (pp. 35-6). He claims even less for the 
synthetic theory of evolution and its relation to population genetics. Population 
genetics is the ‘core’ of the synthetic theory in the sense that it ‘is presupposed by
all other evolutionary studies’ (p. 48). Thus, though the ‘most vital part of the 
theory is axiomatised,...it cannot be denied that the whole theory does not 
possess the deductive completeness possessed, say, by Newtonian mechanics... 
Hence, at best one can say that the evolutionists have the hypothetico-deductive
model as an ideal in some sense—they are far from having it as a realised 
actuality’ (p. 49). Ruse reaches a similar conclusion with respect to the relation 
between Mendelian and molecular genetics:


As a consequence, what I would suggest is that probably we do here have 
a situation akin to that described by Nagel, although not even genetics is so 
rigorously formalised as Nagel supposes every science to be and I must 
confess that strictly speaking I think the present situation is as much that 
there are now no theoretical barriers in the way of a Nagelian-type reduction 
and that there are obvious signposts about how this should be done, as that
such a reduction has been rigorously accomplished (p. 207).
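
The derivation of the Hardy-Weinberg law from Mendel's first law, mentioned above, can be illustrated with a minimal sketch. It assumes an idealised population (one locus, two alleles, random mating, no selection, mutation or migration). These simplifications and the function name `next_generation` are illustrative assumptions and are not taken from Ruse's text.

```python
from fractions import Fraction


def next_generation(freq_AA, freq_Aa, freq_aa):
    """Genotype frequencies after one round of random mating (idealised sketch)."""
    # Mendel's first law: a heterozygote passes on A or a with probability 1/2 each,
    # so the frequency of allele A among gametes is freq_AA + freq_Aa / 2.
    p = freq_AA + Fraction(1, 2) * freq_Aa
    q = 1 - p
    # Random union of gametes gives the Hardy-Weinberg proportions.
    return p * p, 2 * p * q, q * q


# Hypothetical starting population: no heterozygotes at all.
gen0 = (Fraction(1, 2), Fraction(0), Fraction(1, 2))
gen1 = next_generation(*gen0)
gen2 = next_generation(*gen1)
```

Under these assumptions one round of random mating yields the proportions p², 2pq and q², and further rounds leave them unchanged, which is the content of the law.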


Ruse’s belief that population genetics is the core of the synthetic theory of
evolution has received the greatest attention from biologists. Taking Ruse to 
mean that population genetics as it now exists in the biological literature is the 
core of the synthetic theory, they complain that it is not up to the task. For
example, biologists are still unable to explain the prevalence of something as 
fundamental as sexual reproduction on the principles of population genetics 
(Williams [1975]). At least one philosopher (Hull [1973], [1974b]) has raised
parallel objections to Ruse’s claim that Mendelian genetics is derivable from 
molecular biology. Where are the axiomatisations of the relevant theories? 
Where are the reduction functions? Where are the derivations? All that has been
produced thus far are vague gestures and even vaguer promises (see the symposium
on reduction in genetics in Cohen and Michalos [1976]).

Perhaps scientific theories can be interpreted with some profit as axiomatic
systems and theory reduction as the derivation of the axioms of one such 
axiomatisation from another. I have no emotional aversion whatsoever to the 
reductionist research programme. But one should not confuse the casual though 
reasonable story which Ruse tells with the rigorous ideal initially proposed by 
the logical empiricists. However, I suspect that even if and when the most
exacting requirements of the logical empiricist analysis have been fulfilled, much 
that is important about scientific theories and the relation between them will 
have been missed. As Ruse remarks: 


It is perhaps important to emphasise that no actual deduction from a 
purely molecular theory to a purely biological theory seems yet in existence. 
Indeed, reading the works of biologists one gets the feeling that the whole 
question of reduction is more of a philosopher’s problem than a scientist’s 


(p. 207). 


6 CONCLUSION 


Thus far a half dozen books have been written in the logical empiricist tradition 
on the philosophy of biology. Several have argued that biological phenomena 
require the expansion (Beckner [1959]) or major modification (Goudge [1961], 
Hull [1974a]) of the logical empiricist analysis of science, but just as many have
indicated no grave misgivings about its adequacy at least in principle (Simon 
[1971], van der Steen [1973], Ruse [1973]). Of these works, Ruse’s book has 
much to recommend it. It is more comprehensive and comprehensible than 
Beckner and Hull. It delves into the issues raised by biological phenomena in 
greater depth than either Goudge or Simon. And van der Steen’s book is in 
Dutch. Thus, if I were to choose a single text for an upper level undergraduate 
course in philosophy of biology, it would be Ruse’s The Philosophy of Biology.
It presents the best case possible for the general applicability of the logical 
empiricist analysis of science to biology. 


DAVID L. HULL 
The University of Wisconsin, Milwaukee 
1 INTRODUCTION


Wesley Salmon’s latest book consists of four essays: the first is on the development
of Euclidean and non-Euclidean geometries; the second deals with Zeno’s
paradoxes, their mathematical elucidation and physical relevance; the third 
provides an introduction to Special Relativity, leading to a discussion of the
problems of its interpretation in the last essay. Between them they certainly 
provide an excellent and clear exposition of many of the important problems 
about space, time and motion which are of topical concern to physicists and 
philosophers. 

Salmon presents these topics with sure authority which embraces mathematics 


*Review of Wesley C. Salmon [1975]: Space, Time and Motion, A Philosophical Introduction.
California: Dickenson Publishing Company Inc. Pp. 149

and physics as well as philosophy. The book serves as an admirable introduction
to these topics, and provides perspectives on geometry and (mathematical) 
analysis which could be invaluable to students of mathematics and its philosophy.
It also contains an Epilogue, ten pages of very useful notes and references, and 
a comprehensive index.

Yet the book is much more than a well-written ‘introduction’; it is a thought-provoking
discussion of fundamental issues and, as such, could also be invaluable
to the expositors and practitioners of mathematics, physics and
philosophy. 

In the following I will discuss some of the more novel and contentious 
matters raised by Salmon. 


2 PURE AND APPLIED GEOMETRY 


Chapter 1 deals with the interactions and associations between geometry and 
philosophy throughout the ages. Practical geometry is older than the pyramids, 
but pure geometry, based on abstract and more-or-less explicit definitions and 
assumptions, is no older than Greek philosophy. For over two millennia, Greek 
geometry, as systematised by Euclid, was the cornerstone of mathematics and 
of natural philosophy, and it was only in the nineteenth century that daring 
mathematicians like Bolyai, Lobachewski and Riemann proposed that consistent
alternatives to Euclid existed. Salmon raises the question ‘Which of these 
geometries correctly describes the physical space of our universe?’, and after 
a discussion of different viewpoints he concludes (in agreement with Riemann 
and Reichenbach) that space has no intrinsic metric and that the geometry 
applicable to phenomena in physical space depends on our definition of 
‘congruence’, that is on how we define equality of spatial intervals when these 
intervals are located at different places. 

Pure geometry has a criterion of correctness in terms of consistency and its 
relation to other geometrical systems. But geometry applied for the purpose 
of describing the world has on the one hand a conventional basis (depending
on our measurement definitions), and on the other hand an empirical basis 
depending on our observations and physical experiments on the behaviour of 
light rays and solid rods, etc. 

Thus Salmon sets the tone for the rest of his book in his distinction between
mathematics as an art and as a branch of knowledge in its own right, and 
mathematics as a flexible tool which can help us to reveal one aspect of our world. 


3 MATHEMATICAL AND PHYSICAL TIME 


Zeno proposed his challenging paradoxes to support his attitude that reality 
must be considered as an unchanging motionless whole, that motion is an illusion 
whether space and time are continuous or atomistic. Salmon presents four of 
Zeno’s paradoxes of motion and shows how the first two (involving the 
frustrations of Achilles) can be resolved mathematically by employing the modern 
approach to sequences, series and the concept of the limit of an infinite sequence 
whose clarification started with Cauchy and has developed ever since. Students 
usually find this a tortuous topic which is perhaps not surprising when one 
considers the trouble that great mathematicians had, to give it a rigorous 
formulation. However, by showing its power and historical raison d’être and by
presenting it in the context of a problem, Salmon makes the mathematics an
exciting revelation for students and a pleasure of deeper understanding for
their teachers.
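
The mathematical resolution of the Achilles paradox can be made concrete with a short sketch. The speeds and head start below are arbitrary illustrative values, not figures from Salmon's book, and the function names are hypothetical. Each stage is the time Achilles needs to reach the tortoise's previous position, and the partial sums of these stage times approach a finite limit.

```python
from fractions import Fraction


def catch_up_partial_sum(head_start, v_achilles, v_tortoise, stages):
    """Total time elapsed after a given number of Zeno stages."""
    total = Fraction(0)
    gap = Fraction(head_start)
    for _ in range(stages):
        dt = gap / v_achilles      # time for Achilles to cover the current gap
        total += dt
        gap = v_tortoise * dt      # the tortoise's new lead after that time
    return total


def catch_up_limit(head_start, v_achilles, v_tortoise):
    """Limit of the infinite series: the moment Achilles draws level."""
    return Fraction(head_start) / (v_achilles - v_tortoise)


# Illustrative values: head start 100, Achilles ten times faster than the tortoise.
limit = catch_up_limit(100, 10, 1)
remaining_after_5 = limit - catch_up_partial_sum(100, 10, 1, 5)
```

With these values the time still outstanding after n stages shrinks by a factor of ten at each stage, so infinitely many stages fit into a finite interval.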

He repeats the performance by resolving the paradox of the arrow (in motion
or at rest at any instant?) in terms of the modern theory of functions, the
conditions for their continuity which depends on the limit concept, and the 
existence and significance of their derivatives at points of continuity. Again 
we can all benefit from the mathematical and philosophical issues which he 
handles so elegantly and effectively. 

Finally he confronts the paradox of plurality—how many points of zero 
magnitude in a finite line segment? Here he introduces us to modern set theory, 
the new number concepts invented by Georg Cantor and the ideas of measure
theory. In this way he is able to suggest that even this problem has a valid 
mathematical-logical solution, albeit in terms of a highly-sophisticated theory 
in which he arouses our curiosity and interest.

However, once again we need to distinguish between the mathematical 
solution of theoretical problems and the actual nature of space, time and motion 
in our physical world. Salmon believes that the latter is an empirical question
depending on observation and experiment; and that the apparent nature of
space and time will depend also on our definitions of congruence—for time-intervals
as well as for space-intervals. He states that “Given that motion is a
fact of the physical world, it seems to me a further empirical question whether 
it is continuous or not, but I do not think it can be answered a priori.” I would 
agree with this statement, but I further believe that the answer to the empirical 
question is implied by our very conception of time. Time, after all, is an 
idealisation—a human concept associated with and based on our consciousness 
of the succession of events, of change and of motion. On the one hand we have 
the mathematical expression of this concept as a continuum of instants, on the 
other we have its physical expression which is reflected in our consciousness, 
as a succession of events each of which always has a definite duration. Thus in 
the spirit of Salmon’s approach we must clearly distinguish between the idealised 
(and mathematically very useful) instant—a location in time without duration— 
and the physical instant associated with an event which always has some 
duration.

This approach differs of course from proposals (see e.g. Zwart [1976]) which 
suggest that time as well as space exist in the form of quanta with ‘blurred’ 
edges due to the uncertainty principle. Our consciousness of events depends on 
their nature: we can ‘see’ a flash of lightning as if all the points of the moving 
light were occupied simultaneously, the duration of the ‘present’ may be assoc- 
iated with a whole sentence, a phrase of music or a tick of the clock; the ‘present’ 
always has duration so that we can see the movement of a ball passed between 
two players and can only obtain the separate space-and-time locations of the 
ball by a photograph which ‘traps’ an instant. But these instants are only 
still-lifes, not the real world at all. 

I am not suggesting that our consciousness of events as having definite 
durations solves all the paradoxes and problems associated with the concept 
of time, only that these problems flow from the way we have conceptualised
and employed the notion of time. Time is a concept invented by man to help
him explain the world of change and motion around him. It is important that 
this concept and its mathematical aspects be free of internal contradictions and 
paradoxes. As Salmon has shown, modern mathematics has gone a long way
towards resolving problems about it as raised by Zeno and others. However, 
these problems and their solutions are logical and mathematical and have little 
bearing on the world of real events involving non-zero durations and continuous 
movements. 


4 THE ABSOLUTE EFFECTS OF SPECIAL RELATIVITY 


In his third essay Salmon introduces the reader to the basic assumptions and 
relationships of Special Relativity. With his characteristic authority and clarity, 
he tackles the key concept of the relativity of simultaneity by comparing the 
observations of a man on a fast-moving train (with velocity 0.6c) with those
of a man on an embankment beside the track. He shows that the disagreement 
about simultaneity by the two observers is a consequence of each assuming that 
light is isotropic with respect to his own frame, when in fact this is not so. 
Einstein employed a similar demonstration in his 1905 paper, and it is clear 
that if light propagates isotropically with respect to one frame then it cannot
do so with respect to any other. This leads naturally to different criteria of 
simultaneity and of synchronous clocks for any pair of inertial reference frames 
if every observer assumes the isotropy of light with respect to his own frame. 

Salmon presents this result on a Minkowski diagram, emphasises its reciprocity 
aspect and then derives the consequences of employing a light clock—a rod 
with a mirror at each end reflecting a light pulse to and fro—held at right-angles 
to the direction of relative motion. He shows that such a clock must register 
longer units of time when moving than when it is stationary with respect to 
a frame in which light propagation is isotropic. This, then, is the source of the 
time dilatation effect, and it is also reciprocally observed since every inertial 
observer can assume that light propagates isotropically with respect to his own 
reference frame. Salmon’s treatment of this effect seems at first to imply that 
it only obtains for light clocks at right-angles to the relative motion; indeed 
it obtains for any direction of the clock providing we also recognise the existence 
of an associated (reciprocally-observed) effect, namely the length-contraction 
consequence. Salmon shows that this follows from the relativity of simultaneity 
and is rather ambivalent, as are most authorities on this topic, whether the 
effect has any absolute basis or not. 
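
The light-clock argument summarised above can be checked numerically. The following is a minimal sketch, assuming units in which c = 1 and a rod held at right-angles to the motion. The function names are hypothetical and the velocity 0.6c is borrowed from the train example. In the frame where light propagates isotropically, the pulse in a moving clock must cover the hypotenuse of a right triangle on each half-trip.

```python
import math

C = 1.0  # speed of light in illustrative units where c = 1 (an assumption)


def tick_at_rest(rod_length):
    """Round-trip time of the light pulse when the clock is stationary."""
    return 2 * rod_length / C


def tick_when_moving(rod_length, v):
    """Round-trip time when the clock moves at speed v perpendicular to the rod.

    On each half-trip of duration t the pulse covers the hypotenuse, so
    (C * t)**2 = rod_length**2 + (v * t)**2.
    """
    half_trip = rod_length / math.sqrt(C**2 - v**2)
    return 2 * half_trip


# Ratio of moving to stationary tick for a clock travelling at 0.6c.
ratio = tick_when_moving(1.0, 0.6) / tick_at_rest(1.0)
```

The ratio equals the familiar factor 1/sqrt(1 - v²/c²), which is 1.25 at 0.6c and is independent of the rod length.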

The problem of whether this relativistic theory is consistent with the 
emergence of an absolute effect is tackled by Salmon in his last essay. He con- 
siders that this effect has been amply observationally confirmed and particularly 
so in the light of the recent Hafele–Keating experiment involving atomic clocks
circling the earth in opposite directions. He also insists that the problem of how 
a reciprocally-observed phenomenon can be consistent with the emergence of 
an absolute effect (‘the twin paradox’) is one which must be resolved within the 
framework of Special Relativity itself; it is no solution to explain it in terms of 
another theory and invoking acceleration effects, particularly since the effect
also arises in a formulation of journeys free of accelerations. (And here again
I must applaud Salmon for the force of his logical argument and for his willingness
and capacity to confront the challenge of the problem which has so often
been side-stepped or evaded.) 

Salmon analyses such a journey, originally described by Lord Halsbury and
Hermann Bondi [1957], and shows how the paradox is formally exactly resolved
by taking into consideration the change in viewpoint of synchronous clocks
when an observer transfers from one inertial system to another. This change 
in viewpoint is entirely independent of the manner of transfer and of any 
separate effects due to possible accelerations. It is remarkable that this logical 
resolution of the apparent paradox has remained virtually unnoticed while 
the controversy around it continues unabated with the airing of false arguments 
on all sides, for this approach was proposed by Grünbaum as long ago as 1954
and outlined in detail years ago (e.g. Prokhovnik [1967]). One hopes that 
Salmon’s clear demonstration of the non-existence of the paradox will at last 
lay its ghost to rest, so that more important issues are discussed. 

For Salmon the important issues which arise out of his analysis centre around 
the meaning of the relativity of simultaneity which he sees as ‘the key to the 
entire special theory of relativity’. He considers that the clue to this key lies
in our understanding of the velocity of light and the manner by which we 
measure it. A point-to-point measure requires the use of synchronised clocks,
and since the process of synchronism involves reflecting light-signals and an 
assumption about their velocity, we find at every turn that we cannot avoid 
an assumption about light-velocity in order to determine it. He discusses at 
length the conventional character of the synchronism definition whereby an 
observer assumes that if he dispatches a light-signal at time $t_1$, so that it returns
to him at time $t_3$, then it is reflected at a distant object or clock at time $t_2$,

where $t_2 = \tfrac{1}{2}(t_1 + t_3) = t_1 + \varepsilon(t_3 - t_1)$,
where here $\varepsilon = \tfrac{1}{2}$.


It is seen that $t_2$ is calculated, according to Einstein, on the tacit assumption
that the velocity of the signal is the same in both directions. There has been 
a considerable literature in the past decade which has challenged this convention 
as arbitrary and considered the consequences of alternative values of $\varepsilon$. Salmon
discusses these proposals at length but concludes only that in view of its success
and theoretical elegance Einstein’s choice of $\varepsilon = \tfrac{1}{2}$ is a ‘strikingly non-trivial’
convention. 
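
The role of the synchrony parameter can be illustrated with a minimal sketch. It assumes units in which the round-trip speed of light is 1 and a reflector at unit distance. The function names and numerical values are illustrative assumptions, not taken from Salmon's text. The sketch computes the reflection time assigned under a given value of the parameter and the one-way speeds that this assignment implies.

```python
from fractions import Fraction


def reflection_time(t1, t3, eps):
    """Time assigned to the distant reflection under synchrony parameter eps."""
    return t1 + eps * (t3 - t1)


def one_way_speeds(distance, t1, t3, eps):
    """Outbound and return speeds implied by the chosen value of eps."""
    t2 = reflection_time(t1, t3, eps)
    outbound = distance / (t2 - t1)
    inbound = distance / (t3 - t2)
    return outbound, inbound


# Illustrative values: signal leaves at 0 and returns at 2 from a reflector at distance 1.
t1, t3, d = Fraction(0), Fraction(2), Fraction(1)
standard = one_way_speeds(d, t1, t3, Fraction(1, 2))  # Einstein's choice
skewed = one_way_speeds(d, t1, t3, Fraction(3, 4))    # an alternative choice
round_trip_speed = 2 * d / (t3 - t1)                  # independent of eps
```

Only the choice of one half makes the two one-way speeds equal, while the measured round-trip speed is the same for every admissible value. This is why the one-way speed cannot be determined without some synchrony assumption.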

I believe that in this final section, Salmon has espoused a very prominent 
red herring, or at least turned into a tempting cul-de-sac. Of course, this 
$\varepsilon$-excursion is interesting and has led to some important results, for example
by Winnie [1970] regarding the agreement between slow transport and standard- 
signal synchrony. But the logic of the matter implies quite clearly, in agreement 
with Salmon’s conclusion, that Einstein’s convention with $\varepsilon = \tfrac{1}{2}$ is the only
possible one which is consistent with his empirically-based Light and Relativity 
Principles. So that it is not surprising that his convention is strikingly non- 
trivial—it is the only one which could lead to a self-consistent theory. 

As Salmon himself suggests, we must look to the physical world for an 
explanation of the factual content of a theory. The vital consequence of Salmon’s 
resolution of the clock paradox is the realisation that both observation and
theory confirm the existence of an absolute effect associated with uniform motion.
I believe, with Builder [1958a], that such motion must then have some absolute
connotation which gives rise to the absolute effect. Builder showed that the 
assumption of a fundamental reference frame for light propagation renders 
intelligible the relativity of simultaneity, the associated length contraction and 
time dilatation phenomena, and that this leads in turn (Builder [1958b]) to the
constancy of the measure of the velocity of light for every inertial observer in 
respect to his own reference frame. 

This single assumption (of a fundamental reference frame) is sufficient to 
develop the whole theory (see Prokhovnik [1973]). Movement with respect to 
this frame is associated with anisotropy (of light, more generally of energy) 
effects which produce the absolute effects as well as the observational differences 
(e.g. relativity of simultaneity) when all observers employ Einstein’s assumptions
and conventions. In this context, the reciprocity of observations emerges as a
result of the interplay of the absolute and observational effects acting in opposite 
directions. The complex of all these effects is expressed by the Lorentz transformation
(see Prokhovnik [1967]).

In 1905 there was no basis for observing or defining a fundamental reference 
frame, since the theory implied that no local phenomenon or experiment could 
distinguish any such frame. However, modern astronomy and cosmology have
disclosed that our universe consists of an ensemble of galaxies bathed in a sea 
of radiation. As suggested by Bondi [1962] and Bergmann [1970], this knowledge 
provides us with two criteria for defining a fundamental reference frame 
which, many would consider, violates the Principle of Relativity. 

But in fact it does not; instead it elevates Special Relativity into a cosmological 
theory and vindicates the logic of the theory so well drawn out in Salmon’s 
exposition. 

My disagreements with Salmon do not indicate that I find his ideas un- 
interesting. Quite the opposite—his approach not only teaches the reader a lot, 
it also provokes him to think for himself and challenges him to develop the
argument further. 


S. J. PROKHOVNIK
University of New South Wales 
Mach, Einstein, and the Rise of Modern
Science


by ELIE ZAHAR 


The specific problem to which this article is addressed is: did Mach’s 
philosophy of science play a significant role in the genesis of Relativity 
Theory? The thesis that it did is widely held and was recently given a sharp
formulation by Kenneth Schaffner in his [1974] article. This problem is
part of a much wider issue: did positivism exert a positive influence on 
the progress of modern science? Or a baneful influence? Or practically 
no influence at all? There is a widely held view, voiced mainly by Born and 
Bridgman, that the modern scientific revolution is the outcome of a
positivistic revolution in the philosophy of science. Against this view, I
propose to show that positivism was largely irrelevant to the development
of modern physics; while paying lip-service to Machian positivism,
scientists like Einstein remained old-fashioned realists. Had Einstein 
really adhered to the tenets of Machism, then Special Relativity would 
never have seen the light of day. The operationalist approach to scientific 
concepts, which supposedly revolutionised modern physics, is nothing 
more, in Einstein’s case, than the proposal to treat certain definitions
nominalistically. This vindicates Popper’s view² that such nominal-stipulative
definitions are empirically empty and hence make no difference
as to the theories in which they are imbedded. 

Let me now go back to Schaffner’s article in which he criticised my 
[1973] paper as follows: 


Zahar’s account of both the genesis and the reason for the acceptance of Ein- 
stein’s theory says essentially nothing about the reanalysis of the time concept 
and the redefinition of simultaneity. This is in marked contrast to how Einstein’s 
contemporaries felt, and it is in part due to this omission of Zahar’s that I 
believe his perspective on both the discovery of special relativity and the super- 
sedure of Lorentz’s theory of the electrodynamics of moving bodies is fatally 
flawed . . .. A light signal synchronisation procedure could be introduced so 
that, taking two distant points A and B, we ‘establish by definition that the time 
required by light to travel from A to B equals the time it requires to travel from 
Received 3 May 1976 


1 See below pp. 198–9. 2 Popper [1945], vol. 2, p. 14.
B to A.’ (Einstein’s italics in his [1905]). A definitional component in the reconstruction
of the concept was required because the velocity of light in a one
way direction cannot be ascertained in any empirical manner prior to a determination
of distant simultaneity. Since this is an Einsteinian and not a Newtonian
world, such a definition is admissible . . . I myself believe that the admissibility 
of Einstein’s definition depends on the hypothesis of holding the independence 
from its source of the velocity of light... 

Einstein has noted . . . that the philosophical writing of Hume and Mach 
assisted him in his analysis of the concept of time . . . there is a methodological 
approach to the reanalysis of fundamental scientific concepts which one finds 
even in the early editions of Mach’s Science of Mechanics which could not have 
failed to make some impression on Einstein, and which may well have served 
as an unconscious methodological guide for Einstein in his 1905 reanalysis of 
simultaneity. 


For the purpose of my further discussion I should like to separate out
two claims which Schaffner makes (both these claims, as we shall see, 
have been made by other authors like Bridgman): 


(1) A Machian analysis of the concepts of time and simultaneity played
an important part in the discovery of Relativity. 

(2) Einstein’s definitions of simultaneity would be inadmissible in a 
Newtonian world. 


These claims seem to be supported by the following passage from 
Einstein’s article ‘Ernst Mach’ which was published in 1916 shortly after 
Mach’s death: 


The reader has already guessed that I am here alluding to certain concepts 
of the theories of space, time and mechanics which have undergone modifications 
through the Theory of Relativity. Nobody can take away from the epistemologists 
the merit of having helped the new developments in these areas; from my own 
experience I know that I have been, directly or indirectly, aided by epistemologists,
especially by Hume and Mach...

The quoted lines [from Mach’s Mechanics] show that Mach clearly recognised 
the weak spots in Classical Mechanics and was not far from requiring a General 
Theory of Relativity, all this about half a century ago! It is not improbable that 
Mach would have come across Relativity Theory if, at the time when he was in 
his prime, physicists had concerned themselves with the significance of the 
constancy of the speed of light. 

In the absence of this stimulus which came later from Maxwell’s and Lorentz’s 
Electrodynamics, Mach’s critical-epistemological needs were insufficient to 
arouse a feeling for the necessity of a definition for the simultaneity of distant 
events.¹


In view of this quotation, it seems pointless to attack the non-problem 
of whether Machism exerted an important influence on Einstein’s thinking; 


1 Heller [1960], pp. 153-6. My translation. 
for Einstein himself admitted that it did and, after all, Einstein ought to
know. However, the matter is not as simple as that. 

Let me immediately say that I do not intend to examine the question 
whether, as a matter of psychological fact, Mach’s philosophy was present 
to Einstein’s mind when Einstein constructed his theory of Relativity. 
There is absolutely no reason to doubt Einstein’s own statement that he 
was under Mach’s direct or indirect influence at the time when he was 
developing Relativity Theory. My only concern is with the question 
whether there exists any objective connection between Mach’s philosophy 
on the one hand, and the Theory of Relativity as proposed by Einstein on 
the other. It is well known that scientists often make mistakes when giving 
accounts of their own intuitive methodologies. Newton may have sincerely 
believed that he had induced his theories from the facts; but it is obvious 
that the facts alone—i.e. the facts without theoretical admixtures—do
not imply say the Absolute Time Hypothesis. Similarly, Einstein may have 
been mistaken in thinking that Machism helped him to discover Relativity. 
The Machian elements of Einstein’s thought may have been irrelevant
to this discovery, or—more strikingly still—Einstein may even have used 
an approach alien to Machism. This latter possibility is clearly indicated 
by Mach’s own rejection of Relativity. In the Introduction to his book 
on the history of optics Mach wrote: 


I gather from the publications which have reached me, and especially from my 
correspondence, that I am gradually becoming regarded as the forerunner of 
relativity. I am able even now to picture approximately what new expositions 
and interpretations many of the ideas expressed in my book on Mechanics will 
receive in the future from the point of view of relativity . . . I must, however, 
as assuredly disclaim to be a forerunner of the relativists as I withhold from the 
atomistic belief of the present day. 

The reason why, and the extent to which, I discredit the present-day relativity 
theory, which I find to be growing more and more dogmatical, together with the 
particular reasons which have led me to such a view—considerations based 
on the physiology of the senses, the theoretical ideas and, above all, the con- 
ceptions resulting from my experiments—must remain to be treated in the 
sequel.¹

Given these contradictory accounts of the relationship between Machism 
on the one hand and the epistemological foundations of Relativity on the 
other, I have decided to follow Einstein’s own advice, namely to investigate 
what Mach and Einstein actually did and not what they said they 
did. 

Before embarking on this investigation let me point out that the problem 
tackled in this paper transcends its connection with a particular historical 


1 Mach [1913], Preface. 
episode, namely the emergence of Relativity Theory. Mach’s philosophy
is strictly positivistic. It has been maintained—by Bridgman and Born 
among others—that Special Relativity is the paradigm of a physical theory 
based on the positivistic requirement that all scientific concepts should be
operationally defined. According to Bridgman¹:


The important step that Einstein made was that in analysing the connection 
of the equations with the theory he was led to examine the details of what we 
do in applying the equations in any specific case. In particular, one of the 
variables in the equations was the time—what do we do in obtaining the number 
which replaces the general symbol for time when we apply the equations to a 
concrete case? As physicists we know that this number is obtained by reading 
a clock of some sort, that is, it is a number given by a prescribed physical operation.
Or the equations may involve the times of two different events in two differ-
ent places, and to understand completely what is now involved we must analyse 
what we do in determining the time of two such events. Analysis shows that
we read two clocks, one at each place. A new element now enters, because a 
complete description of all the manipulations involved demands that we set 
up some method by which we compare the two clocks with which we measure 
the two events. Out of the examination of what we do in comparing the clocks, 
we all know Einstein’s revolutionary recognition that the property of two events 
which hitherto had been unthinkingly called simultaneity involves in the doing 
a complicated sequence of physical operations which cannot be uniquely specified 
unless we specify who it is that is reading the clocks. We know that a consequence 
of this is that different observers do not always get the same results, so that 
simultaneity is not an absolute property of two events but is relative to the 
observing system... 


1 Bridgman [1936], pp. 7-8.

Through his operational definition of coordinate time, Einstein is
supposed to have set an example for physicists like Heisenberg and Born
who subsequently developed Quantum Mechanics. Heisenberg allegedly
followed Einstein’s lead when he decided to construct his new mechanics
as a calculus of observables. In a series of letters to Einstein, where he
expresses his puzzlement over Einstein’s rejection of the Copenhagen
Interpretation of Quantum Mechanics, Born wrote:


I was an unconditional follower and apostle of the young Einstein and swore
by his theories; I could not imagine that the old Einstein thought differently.
Einstein had based Relativity Theory on the principle that concepts which refer
to unobservables have no place within physics: a fixed point in empty space and
the absolute simultaneity of two distant events are such [unacceptable] concepts.
Quantum Theory arose when Heisenberg applied this same principle to the
electronic structure of the atom. This was a bold fundamental step which at
once struck me as self-evident and led me to put all my powers at the service of
this new idea. I found it obviously impossible to grasp that Einstein refused to
accept the validity within Quantum Theory of this principle which he had very
successfully used himself . . . In a previous letter he [Einstein] had expressed
his views by saying that he was averse to the philosophy of the “esse est percipi”.1

Thus, according to Born, we owe both Relativity Theory and Quantum 
Mechanics to a positivistic approach which was successfully initiated by
Einstein. Since Relativity and Quantum Mechanics form the cornerstone 
of modern physics, it follows from Born’s claim that modern physics as a 
whole is the result of a positivistic revolution in philosophy. Hence, if I 
can show that, on the contrary, Special Relativity has essentially nothing 
to do with positivism, then the positivist view of the emergence of modern 
science will be strongly undermined; this is so especially because Special 
Relativity supposedly acted as a model which was emulated by scientists 
working in different fields. 

Since the thesis of Mach’s influence has been so clearly formulated by 
Schaffner, let me readdress myself to his claims formulated above (p. 196).
Against Schaffner’s claims I shall defend the following two theses: 


(1’) If one adopts a Machian critique of scientific concepts (as opposed,
for example, to a Popperian critique of scientific propositions) then
both Einstein’s definition of co-ordinate time and Special Relativity
as a whole prove unacceptable on more than one count. Hence
Mach was right in rejecting Relativity as incompatible with his
own epistemological viewpoint.

(2’) However, if one adopts a propositional approach to scientific 
hypotheses, then Einstein’s definition of time is clearly recognised 
for what it really is; namely a stipulative nominal definition which, 
being nearly empty, is compatible with all rival hypotheses.2


It will be shown not only that Einstein’s definition of time is compatible 
with classical physics but also that all classical theories can be reformulated 
in terms of the time variable as determined by Einstein’s synchronisation 
convention. Needless to say, classical theories, when thus reformulated, 
assume a form far more complicated than the usual one. (This result is the
obverse of Grünbaum’s.3 Grünbaum showed that Einstein’s definition
of time is the most convenient one for Special Relativity. It is hardly 
surprising that the same definition should not be the simplest one for 
classical physics.) 

1 Einstein and Born [1969], pp. 299-300. My translation.
2 Popper [1945], vol. 2, p. 14.
3 Grünbaum [1973], chapter 12.

It follows from (1’) and (2’) that Einstein’s analysis of the time concept
alone (i.e. independently of certain propositions which form part of
Einstein’s theory) could not have led to Special Relativity. This result supports
Popper’s general view that propositions and not concepts play a primary
role in the progress of science. Moreover, my claim is perfectly compatible
with Grünbaum’s correct thesis that, in the light of Relativity Theory,
Einstein’s particular choice of the time variable is the most convenient
one. Taken together with Grünbaum’s thesis, my claim entails that the
main merit of Einstein’s definition is the purely pragmatic one of possessing
excess symbolic simplicity over rival conventions.

After vindicating these two theses I shall go on to argue that: 


(3’) Although the influence of Machism on Special Relativity was 
negligible, there is a sense in which the General Theory embodies 
Mach’s approach to certain physical hypotheses. We shall see that 
Mach came very near to requiring a condition of general covariance
for certain theories. 


This presents us with the following paradox. How is it that an extreme 
form of positivism like Machism influenced a highly speculative theory 
like General Relativity but not its forerunner, namely Special Relativity, 
which seems far more acceptable from an empiricist point of view? Special 
Relativity is a severely tested theory whose concepts appear to be
susceptible of operational definition. As opposed to this, the empirical basis of
General Relativity is slight and the connection between the coordinates
and any process of direct measurement, which is an important feature of
Special Relativity, seems largely lost. In order to resolve this paradox I
shall distinguish between Mach’s general positivistic philosophy—hence- 
forth referred to as Machism—and the intuitive methodology which he 
used in criticising Newtonian Mechanics. It turns out that Machism played 
hardly any part in the genesis either of the Special or of the General 
Theory of Relativity; however Mach’s intuitive methodology did guide 
Einstein in the construction of his gravitational theory (i.e. General 
Relativity). 

As a preliminary to showing that it played a negligible role in the genesis 
of Relativity, I shall give a brief exposition of Machism, i.e. of Mach’s
explicitly articulated brand of positivism. 

Mach’s ontology consists of the so-called elements which are the simple 
component parts of sensations. Colours, smells and shapes are typical 
elements of sensations. All elements are interconnected. The task of 
science is to establish relations between the elements in such a way that 
economy of thought is maximised. Mach assumed that such elements can 
be dealt with in a quantitative way.1 Let $a_1, a_2, \ldots, a_m$ be numerical
measures of elements of sensation; then science aims at constructing
functions $f_1(x_1, \ldots, x_m), f_2(x_1, \ldots, x_m), \ldots, f_n(x_1, \ldots, x_m)$ such that:

$$f_i(a_1, \ldots, a_m) = 0 \quad \text{for all } i = 1, 2, \ldots, n. \tag{1}$$

1 Mach [1906], chapters 1 and 14.

In general $m$ will be larger than $n$ so that, given any $(m-n)$ among the
$m$ quantities $a_1, \ldots, a_m$, equations (1) enable us to compute the remaining
$n$ ones. This is what Mach means by saying that one important task of
science is the completion of facts in thought: if we know $(m-n)$ facts,
e.g. the values of $a_1, \ldots, a_{m-n}$ in (1), then the equations enable us to
compute $a_{m-n+1}, \ldots, a_m$, i.e. to complete (or extend) the set
$\{a_1, \ldots, a_{m-n}\}$ to the set $\{a_1, \ldots, a_{m-n}, a_{m-n+1}, \ldots, a_m\}$.


Mach uses this conceptual approach in order to replace the asymmetric 
relation of cause to effect by a symmetric relation: namely the relation of 
the functional interdependence between the various elements of sensation. 
Given equations (1), a change in any $(m-n)$ among the $m$ quantities
$a_1, \ldots, a_m$ determines a corresponding change in the remaining $n$ quantities.
Thus a change in any $(m-n)$ among the $m$ elements $a_1, \ldots, a_m$ causes a
change in the remaining ones. The division into cause and effect is therefore
arbitrary. For example, given Boyle’s law that for a fixed mass of gas
at constant temperature pv = k, we can say that a variation of pressure 
causes a corresponding variation of volume and vice-versa. 
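
The symmetry of Boyle’s law can be made concrete by a minimal sketch in Python. It is only an illustration: the function name complete_boyle and the numerical values are assumptions of mine, not anything found in Mach. Given the constant k and either one of p or v, the relation pv = k supplies the other, so that neither quantity is singled out as the cause.

```python
def complete_boyle(k, p=None, v=None):
    """Complete the pair (p, v) from the relation p * v = k.

    Exactly one of p or v must be supplied; the other is computed.
    The name and signature are hypothetical, chosen for this sketch.
    """
    if (p is None) == (v is None):
        raise ValueError("supply exactly one of p or v")
    if p is not None:
        return p, k / p   # pressure known, volume completed
    return k / v, v       # volume known, pressure completed


# Starting from the pressure or from the volume yields the same pair.
from_pressure = complete_boyle(12.0, p=3.0)
from_volume = complete_boyle(12.0, v=4.0)
```

Which of the two quantities is treated as given is a matter of how the function is called, not of the law itself.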

Let us note that each of the $m$ quantities $a_1, \ldots, a_m$ refers to an element
of sensation, i.e. to an observable. Thus an observable effect must have an 
observable cause, i.e. only observables can act on observables. So, for 
example, according to Mach, the oblateness of the earth is caused not by 
the earth’s acceleration in absolute space but by its rotation relatively to 
the observable stars. (This conclusion will play an important part in the 
genesis of General Relativity.) 

Let us go back to equations (1) and to the requirement that science 
should maximise economy of thought. The functions $f_1, f_2, \ldots, f_n$ must
therefore be simple in the strictly pragmatic sense of minimising intellectual
discomfort. For example, equations (1) should enable us to express
$a_{m-n+1}, \ldots, a_m$ in terms of $a_1, \ldots, a_{m-n}$ with the least expenditure of effort.
Noting that $a_1, \ldots, a_m$ are measures of elements of sensations, i.e. of
observables, we conclude that according to Mach only the arguments of
the $f_i$ but not the $f_i$ themselves possess ontological status. The functions $f_i$,
and consequently all scientific laws, exist only in our minds and are thus
subjective or at any rate dependent on social and cultural circumstances. 
It follows that the simplicity of scientific laws has no objective status. 
This consequence of Machism will play a significant role in connection 
with the Covariance Principle. In his diary (published by Dingler) Mach 
wrote: 


[A law is] Only [the] completion of experience and applicable only as long as
experience remains the same. Confirmation and refutation. Completion. Abstraction
from present conditions. That which is constant, the rule, [is] an anchoring point
for us [and] does not exist outside our thought [mind].1

Let us now turn to Mach’s operationalist view of scientific concepts. 
Mach maintained that definitions in science should fulfil the following
conditions: 


(1) The definition of a scientific concept (or quantity) should provide us 
with a concrete method for deciding whether or not the concept applies 
(in the case of physical quantities the definition should furnish a means of 
measuring the quantity in question). The meaning of a scientific concept 
is the sequence of operations—i.e. the set of sensations—which constitutes 
a decision procedure for the applicability of the concept. Grünbaum
correctly pointed out (in his [1954]) that operationalism in this sense
confuses pragmatics with semantics. However, Mach’s approach to
definitions is evidently in line with his ontology. A concept or quantity is
metaphysical if it is not operationally defined; hence every notion which
refers to hidden entities (i.e. to entities inaccessible to direct observation)
is metaphysical. One aim of mature science is to purge itself of all
metaphysical notions. Mach implicitly assumed that science would thereby
become more economical. He did not seriously consider the possibility
that the use of metaphysical concepts might prove indispensable, even in
the long run, for maximising economy of thought.

According to Mach, theories are metaphysical if they contain notions 
which are not operationally defined. Thus the status of theories is deter- 
mined by that of the concepts which occur in them. In this sense concepts 
play a primary role, and theories a derivative one, in Mach’s methodology. 

1 Dingler [1926], p. 100. My translation.

This conceptual (as opposed to the propositional) approach had
another important effect on Mach’s philosophy of science. It led him to
try to eliminate the asymmetry between cause and effect in scientific
explanation. For example, in examining Boyle’s law, Mach concentrated
on the concepts of pressure and volume. Since changes in p and v
co-determine each other, either of them can be regarded as the cause and the
other as the effect. Mach concluded that, in all cases, the cause-effect
relation can be replaced by the symmetric relation of functional
dependence. The propositional approach reinstates the asymmetry between
cause and effect as follows. Only states of affairs can be causes of other
states of affairs and states of affairs are described not by concepts, but by
propositions. Let T, I, P stand for: scientific theory, initial conditions and
prediction respectively. Given T, I will be a cause of P iff $T \to (I \to P)$.
Even though this relation may hold, it does not necessarily follow of course
that $T \to (P \to I)$. The asymmetry between the cause I and the effect P is
taken account of by the asymmetry of the functor $\to$ which connects
propositions. This is not to deny that cause and effect may in some cases
be interchangeable in the sense that in these cases $T \to (I \leftrightarrow P)$ holds.
However, such symmetry need not obtain. Although Einstein paid lip-service
to Mach’s view that the only task of science is to introduce order
into our sense-perceptions1 (i.e. to set up functional relations between
elements of sensations), Relativity Theory is a causal theory in the
traditional sense of the word. The asymmetry between cause and effect is as
important a feature of Relativity as it is of any classical hypothesis.2
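
The logical point about the asymmetry of the conditional can be checked mechanically. The following sketch treats the arrow as the material conditional (an assumption made purely for illustration, since the text leaves the functor unanalysed) and enumerates all truth-value assignments to T, I and P, listing those under which T → (I → P) holds while T → (P → I) fails. The helper name implies is hypothetical.

```python
from itertools import product


def implies(a, b):
    """Material conditional: false only when a is true and b is false."""
    return (not a) or b


# Assignments (T, I, P) that satisfy T -> (I -> P) but not T -> (P -> I).
counterexamples = [
    (T, I, P)
    for T, I, P in product([False, True], repeat=3)
    if implies(T, implies(I, P)) and not implies(T, implies(P, I))
]
```

A single assignment survives, namely T true, I false and P true, which suffices to show that the first conditional does not entail the second.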

Apart from condition (1) above, Mach subjects definitions of scientific 
terms to a further condition: 


(2) Definitions should be free of theoretical presuppositions, that is all 
definitions should be theory-independent. 
An immediate consequence of (2) is: 


(3) The definition of a scientific concept should not depend on a theory 
which contains the concept in question. Let us call this condition Mach’s 
vicious circle principle. Let us say of two concepts A and B that A involves B
if A is defined in the light of a hypothesis in which B occurs. Thus Mach’s
vicious circle principle states that no concept should involve itself.3


I propose to show that, although Mach did not—and indeed could not— 
satisfy (2), i.e. although he could not free his definitions from all theoretical 
assumptions, he did manage to satisfy (3) and hence to avoid vicious circu- 
larity in some cases, e.g. in his redefinition of mass. I shall now give both 
Mach’s definition of mass and Einstein’s definition of coordinate time, 
then demonstrate that the two definitions are radically different. 

In connection with classical mechanics Mach’s intention was to elimin- 
ate, without any loss of empirical content, all metaphysical notions from 
Newtonian Theory. The main problem is posed by Newton’s second law
$F = ma = m\,d^2r/dt^2$ which connects four distinct notions: force, mass,
space and time. 

1 Einstein [1922], pp. 1-2.
2 Adler, Bazin, Schiffer [1965], chapter 7.
3 See below, pp. 205-6.

Since force, if treated as existing independently, is unobservable,
Mach’s positivism enjoined him to treat the second law as a definition of
force. Because distance and time are in principle measurable, the remaining
problem was to give an independent operational definition of mass.
Newton had taken mass to be the measure of the quantity of matter in a
given body, but this was precisely the kind of metaphysical definition
which Mach abhorred. Mach envisaged measuring mass by means of
balance scales. This meant defining mass in terms of weight. However,
weight is nothing but gravitational force and force is mass times acceleration.
We are thus caught in a vicious circle and condition (3) is violated.
This is why Mach abandoned this approach and proposed instead the
following solution. Take two bodies A and B; refer them to the frame
determined by the stars; remove them from all neighbouring sources of
external interference; then define the mass ratio of A and B as the negative
inverse ratio of the accelerations which A and B induce in each other.


Thus: $m_A/m_B = -a_{BA}/a_{AB}$, where $m_A/m_B$ = ratio of the mass of A to
that of B, $a_{BA}$ = acceleration induced in B by A, $a_{AB}$ = acceleration
induced in A by B.
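
As a rough illustration of how this definition operates, the sketch below computes the mass ratio from two measured accelerations. The function name mach_mass_ratio, the sign convention (signed components taken along the line joining A and B) and the sample figures are all hypothetical.

```python
def mach_mass_ratio(a_AB, a_BA):
    """Return m_A / m_B = -a_BA / a_AB.

    a_AB: acceleration induced in A by B.
    a_BA: acceleration induced in B by A.
    Both are signed components along the line joining the two bodies.
    No notion of force or of weight enters the computation.
    """
    if a_AB == 0:
        raise ValueError("A shows no acceleration, so the ratio is undefined")
    return -a_BA / a_AB


# If B is accelerated three times as strongly as A, in the opposite
# direction, then A has three times the mass of B.
ratio = mach_mass_ratio(a_AB=2.0, a_BA=-6.0)
```

Only lengths and times are needed to obtain the two accelerations, which is the sense in which the definition avoids circularity.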

Mach wrote: 


[I shall] show how I think the conception of mass can be quite scientifically 
developed. The difficulty of this conception, which is pretty generally felt, lies, 
it seems to me, in two circumstances: (1) in the unsuitable arrangement of the
first conceptions and theorems of mechanics; (2) in the silent passing over im- 
portant presuppositions lying at the basis of the deduction. 

Usually people define m = p/g and again p = mg. This is either a very
repugnant circle, or it is necessary for one to conceive force as “pressure”. The
latter cannot be avoided if, as is customary, statics precedes dynamics. The 
difficulty, in this case, of defining magnitude and direction of a force is well 
known. 

In that principle of Newton, which is usually placed at the head of mechanics, 
and which runs “Actioni contrariam semper et aequalem esse reactionem: sive
corporum duorum actiones in se mutuo semper esse aequales et in partes con- 
trarias dirigi”, the “actio” is again a pressure, or the principle is again un- 
intelligible unless we already possess the conception of force and mass. But 
pressure looks very strange at the head of the quite phronomical mechanics of 
today. However this can be avoided.1 


1 Mach [1989], p. 82.

Concerning his final definition of the mass-ratio as the negative inverse
of the acceleration-ratio Mach maintained:


A special difficulty seems to be still found in accepting my definition of mass.
Streintz... has remarked in criticism of it that it is based solely upon
gravity, although this was expressly excluded in my first formulation of
the definition (1868). Nevertheless, this criticism is again and again put forward,
and quite recently even by Volkmann. My definition simply takes note of the
fact that bodies in mutual relationship, whether it be of action at a distance, so
called, or whether rigid or elastic connections be considered, determine in one
another changes of velocities (accelerations). More than this one does not need
to know in order to be able to form a definition with perfect assurance and
without the fear of building on sand. It is not correct, as Höfler asserts, that this
definition tacitly assumes one and the same force acting on both masses. It does
not even assume the notion of force, since the latter is built up subsequently
upon the notion of mass, and gives then the principle of action and reaction quite
independently and without falling into Newton’s logical error. In this
arrangement, one concept is not misplaced on another which threatens to give way
under it.1


Thus Mach defined mass in terms of length and time, thereby avoiding
all vicious circularity. This does not imply (as Mach sometimes seems to
think it does) that he needed no theoretical assumptions at all. He assumed
for example that the ratio $a_{BA}/a_{AB}$ is independent of the relative positions
and velocities of A and B. However he made use of no supposition which
involves the notion of mass.

My intention is now to examine Einstein’s definition of coordinate
time. As we shall see, Einstein’s approach, far from being Machian,
involves a step of precisely the kind which Mach banned. Einstein starts
his famous [1905] paper by proposing the following two postulates:

(P1): All laws of nature assume the same form in all inertial frames.

(P2): Light is always propagated in empty space with a definite velocity
c which is independent of the state of motion of the emitting body.

An immediate consequence of P1 and P2 is:


(Q): Let A and B be any two points at rest in some inertial frame. 
Then the velocity of light from A to B is equal to its velocity from 
B to A. Hence light takes the same time to travel from A to B as 
it does to travel from B to A. 


On the basis of (Q) Einstein gives his well-known convention for
clock-synchronisation. Let C and C’ be two identical clocks which are at rest
at two points O and P of some inertial frame J. Without loss of generality,
suppose that O is the origin of J. Let a light signal leave O at C-time $t_1$
and reach P at C’-time $t'$; the light beam is instantaneously reflected at P,
then goes back to O where it arrives at C-time $t_2$. Supposing that C and C’
measure the coordinate times in their respective neighbourhoods, we have
in view of (Q): $t' - t_1 = t_2 - t'$, i.e.

$$t' = \tfrac{1}{2}(t_1 + t_2) \tag{*}$$
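
That (*) can be decided from clock readings alone may be illustrated by the following sketch. The function name is_synchronised and the tolerance parameter tol are assumptions introduced for the example; the three arguments are the readings t1 and t2 taken from C and the reading t' taken from C’.

```python
def is_synchronised(t1, t_prime, t2, tol=1e-9):
    """Check equation (*): t' = (t1 + t2) / 2, up to a tolerance.

    t1:      C-time at which the signal leaves O.
    t_prime: C'-time at which the signal is reflected at P.
    t2:      C-time at which the signal returns to O.
    """
    return abs(t_prime - 0.5 * (t1 + t2)) <= tol


# A reflection read halfway between departure and return passes the test;
# any other reading of C' fails it.
passes = is_synchronised(0.0, 5.0, 10.0)
fails = is_synchronised(0.0, 6.0, 10.0)
```

The check uses nothing beyond three numbers read off two clocks; whether light really takes equal times in both directions is not something it can test.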

1 Mach [1893], chapter 2, x.

Note that equation (*) is operationally decidable. The two clocks C
and C’ are said to be synchronised if (*) holds for all values of $t_1$. If C
and C’ are synchronised, the coordinate time at P is as given by the clock
C’. Thus the definition of coordinate time is given in the light of
proposition Q which contains the notion of velocity. But velocity is rate of
change of position with respect to time. We are therefore caught in the
sort of vicious circle which offends against condition (3) above. Einstein
thus violated one of Mach’s cardinal methodological principles. This is
hardly surprising; the two notions of time and space are so fundamental,
so primitive, that it seems impossible to define either in terms of something
more fundamental. Mach thought that he could boil everything down to
spatial measurements. He overlooked the fact that in expressing time as a
function of spatial measurement (e.g. of the earth’s angle of rotation) he
still needed the notion of simultaneity; and that simultaneity is irreducible
to purely spatial relations.

There is another respect in which Special Relativity is unacceptable 
from a strictly Machian viewpoint. In his [1905] Einstein assumes the
existence of a set of inertial frames which are privileged over all others.
Such frames are abstract entities; they are not determined by any relation
they might bear to observables. Hence, if we eliminate from Newtonian
Theory an idle metaphysical component, namely the Absolute Space
Hypothesis, and speak instead of the absolute set of inertial frames, Special
Relativity turns out to be no less ‘absolute’ than Classical Mechanics.
Both theories postulate sets of unobservable inertial frames; the main
difference between them is the difference between the groups of
transformations which map one inertial frame onto another; these are the
Galilean group in the Newtonian case and the Lorentz group in the
relativistic one. In Special Relativity for example, one way of resolving
the twin paradox is by means of the different relations which the twins
bear to the set of inertial frames. Hence an observable effect, namely the
retardation of one of the twins’ clocks, is explained in terms of a hidden
cause, namely the existence of a set of abstract frames. Einstein himself 
was later to object to this ‘absolute’ character of his own theory. It is 
therefore small wonder that Mach should have disowned Relativity. 

Let us return to the circularity inherent in the relativistic definition of 
coordinate time. Einstein was aware of this circularity when he wrote the 
following dialogue between himself on the one hand and an imaginary 
critic of his (popular) account of Special Relativity on the other:


I am very pleased with this suggestion [i.e. with the operational definition of
simultaneity], but for all that I cannot regard the matter as quite settled, because 
I feel constrained to raise the following objection: “Your definition would 
certainly be right if only I knew that the light by which the observer at M 
perceives the lightning flashes travel along the length AM as along the length 
BM. But an examination of this supposition would only be possible if we already 
had at our disposal the means of measuring time. It would thus appear as though 
we were moving here in a logical circle.” 

After further consideration you cast a somewhat disdainful glance at me—and 
rightly so—and you declare: “I maintain my previous definition nevertheless, 
because in reality it assumes absolutely nothing about light. There is only one 
demand to be made of the definition of simultaneity, that in every real case it
must supply us with an empirical decision as to whether or not the conception
that has to be defined is indisputable. That light requires the same time to
traverse the path AM as for the path BM is in reality neither a supposition nor
a hypothesis about the physical nature of light, but a stipulation which I can
make of my own freewill in order to arrive at a definition of simultaneity.”1


Einstein’s answer to the objection that he levels at his own definition is
correct if one adopts a propositional as opposed to a conceptual stand with
regard to scientific theories. Let the sentence ‘C is synchronised with C’
at C-time $t_1$’, henceforth referred to as $S(C, C', t_1)$, stand for equation
(*) supplemented by an interpretation of $t_1$, $t_2$ and $t'$. The proposition
$S(C, C', t_1)$ is empirically decidable. Thus Einstein’s definition satisfies
condition (1) above. This leads us to the following problem. Let us weaken 
Mach’s methodological position with regard to definitions by elimi- 
nating conditions (2) and (3) and requiring only that condition (1) be 
satisfied. Is it then the case, as Schaffner claims, that, by trying to give 
a definition of time which fulfils (1), Einstein was led to his Theory of 
Relativity? 

My answer is again in the negative. This is again hardly surprising, for 
Einstein himself admits that his definition is stipulative. Such nominal- 
stipulative definitions are nearly empty, hence compatible with all rival 
hypotheses. As regards the definition of scientific terms Popper wrote: 


The scientific use of definitions . . . may be called its nominalist interpretation, 
as opposed to its Aristotelian or essentialist interpretation. In modern science, 
only nominalist definitions occur, that is shorthand symbols or labels are intro- 
duced in order to cut a long story short. And we can at once see from this that 
definitions do not play any very important part in science. For shorthand 
symbols can always, of course, be replaced by the longer expressions, the 
defining formula, for which they stand . . . Our scientific knowledge, in the sense 
in which this term may be properly used, remains entirely unaffected if we 
eliminate all definitions; the only effect is upon our language, which would lose,
not precision, but merely brevity.2


Popper’s position sounds somewhat extreme, but it is essentially correct. 
As regards Relativity, I maintain not only that Einstein’s definition of
time is compatible with classical physics, but also that a classical physicist
can adopt Einstein’s convention for clock synchronisation (thereby
obtaining a time variable $t^*$ different from the Galilean variable $t$) and
reformulate all his theories in terms of $t^*$. Evidently classical theories,
when expressed in terms of $t^*$, assume a more complicated form than when
expressed in terms of $t$.

1 Einstein [1920], pp. 22-3.
2 Popper [1945], vol. 2, p. 14.

Let us assume that we live in a ‘classical’ universe and let OXYZ be
an absolute frame determined, say, by an ether at rest. The ether frame
OXYZ is inertial and is such that Maxwell’s equations hold in it. Hence,
in OXYZ, light is propagated with a constant velocity c in all directions.
Let O’X’Y’Z’ be a Galilean frame of reference which slides with constant
speed u along the x-axis OX of OXYZ.


Figure 1


As is well known, the absolute and Galilean coordinates are connected
by the following equations:

$$x' = x - ut, \quad y' = y, \quad z' = z, \quad t' = t \tag{2}$$

where $(x, y, z, t)$ and $(x', y', z', t')$ are the coordinates of the same
point-event P relatively to the frames O and O’ respectively.

At O’ let there be a clock C’ which measures the absolute time $t' = t$;
i.e. C’ is contiguously synchronised with all the ‘absolute’ clocks which C’
passes as it moves along OX. Let C* be a clock which remains immobile
at P with respect to the Galilean frame O’.

We assume that C* is synchronised with C’ by means of Einstein’s
convention. Let $t^*$ be the time variable which is obtained in this way.
Thus a light signal leaves O’ at (absolute) time $t_1$, arrives at P at (absolute)
time $t$, is instantaneously reflected at P and then goes back to O’ where it
arrives at (absolute) time $t_2$.

By Einstein’s stipulation:

$$t^* = \tfrac{1}{2}(t_1 + t_2). \tag{3}$$

Let O’P = $r$ and $\angle PO'X = \theta$. Thus:


$$x' = r\cos\theta, \quad \text{i.e.} \quad x - ut = x' = r\cos\theta \tag{4}$$

[by equations (2)]. We also have:

$$t_1 = t - \frac{r}{v_1} \quad \text{and} \quad t_2 = t + \frac{r}{v_2} \tag{5}$$

where $v_1$ and $v_2$ are the Galilean velocities of light along O’P and PO’
respectively (by Galilean velocity we mean rate of change of position,
where position and time are given by $(x', y', z', t')$ as expressed in (2)).

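It follows from (3) and (5) that:

$$t^* = t + \frac{r}{2}\left(\frac{1}{v_2} - \frac{1}{v_1}\right) = t + \frac{r}{2}\left[\frac{v_1 - v_2}{v_1 v_2}\right] \tag{6}$$

It remains for us to compute $v_1$ and $v_2$ in terms of $r$ and $\theta$.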





Figure 2 


We know that the absolute speed of light is equal to c. Thus $\vec{v}_1$ and $\vec{u}$
must add up to a vector whose magnitude is c, i.e. O’Q = c, i.e. $(O'Q)^2 = c^2$;
i.e. $v_1^2 + u^2 + 2uv_1\cos\theta = c^2$; i.e.

$$v_1^2 + (2u\cos\theta)v_1 - (c^2 - u^2) = 0. \tag{7}$$

Similarly:

$$v_2^2 - (2u\cos\theta)v_2 - (c^2 - u^2) = 0 \tag{8}$$

where $v_2$ is the velocity of light during the return journey from P to O’
($\theta$ is thereby replaced by $\pi - \theta$. Note that $\cos(\pi - \theta) = -\cos\theta$, which
explains the transition from (7) to (8)).

Solving (7) and (8) for $v_1$ and $v_2$ respectively:

$$v_1 = -u\cos\theta + (u^2\cos^2\theta + c^2 - u^2)^{1/2}, \qquad v_2 = u\cos\theta + (u^2\cos^2\theta + c^2 - u^2)^{1/2} \tag{9}$$

Substituting in (6):

$$t^* = t - \frac{ru\cos\theta}{c^2 - u^2} \tag{10}$$
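By (4):

$$\begin{aligned}
t^* &= t - \frac{ux'}{c^2 - u^2} \\
&= t' - \gamma^2\frac{ux'}{c^2} \quad [\text{since } t = t'] \\
&= t - \gamma^2\frac{u}{c^2}(x - ut) \\
&= \gamma^2\left(t - \frac{ux}{c^2}\right)
\end{aligned} \tag{11}$$

where

$$\gamma = \left(1 - \frac{u^2}{c^2}\right)^{-1/2}. \tag{12}$$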


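As for the spatial coordinates $x^*, y^*, z^*$, which are supposed to be
measured by means of rigid rods, we have:

$$x^* = x', \quad y^* = y', \quad z^* = z'. \tag{13}$$

In conjunction with (2), we obtain:

$$x^* = x' = x - ut, \quad y^* = y' = y, \quad z^* = z' = z$$

and

$$t^* = t' - \gamma^2\frac{ux'}{c^2} = \gamma^2\left(t - \frac{ux}{c^2}\right). \tag{14}$$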


Equations (14) establish a one-one (linear) correspondence between
$(x^*, y^*, z^*, t^*)$ and $(x', y', z', t')$. Hence all laws expressible in terms of
the Galilean coordinates $x', y', z', t'$ are also expressible in terms of
$x^*, y^*, z^*, t^*$; and vice-versa, of course.

For a classical physicist therefore, the time variable $t^*$ determined by
means of Einstein’s convention is just as legitimate as the Galilean
variable $t'$.1
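
The derivation can be checked numerically. The sketch below is an illustration under stated assumptions: units in which c = 1, arbitrary sample values with u < c, and hypothetical function names. It computes t* once by following the light signal, using the Galilean velocities v1 = −u cos θ + (u² cos² θ + c² − u²)^½ and v2 = u cos θ + (u² cos² θ + c² − u²)^½ together with t* = ½(t1 + t2), and once from the closed form t* = γ²(t − ux/c²) with γ = (1 − u²/c²)^(−½), and then compares the two results.

```python
import math


def t_star_by_signals(t, r, theta, u, c):
    """t* obtained by following the out-and-back light signal."""
    root = math.sqrt(u**2 * math.cos(theta)**2 + c**2 - u**2)
    v1 = -u * math.cos(theta) + root   # Galilean speed of light from O' to P
    v2 = u * math.cos(theta) + root    # Galilean speed of light from P to O'
    t1 = t - r / v1                    # absolute time of departure from O'
    t2 = t + r / v2                    # absolute time of return to O'
    return 0.5 * (t1 + t2)             # Einstein's stipulation


def t_star_closed_form(t, r, theta, u, c):
    """t* = gamma^2 * (t - u x / c^2), with x = u t + r cos(theta)."""
    x = u * t + r * math.cos(theta)        # absolute x-coordinate of the event
    gamma_sq = 1.0 / (1.0 - u**2 / c**2)   # square of gamma
    return gamma_sq * (t - u * x / c**2)


# Sample values (assumed): c = 1, frame speed u = 0.3, event at r = 1.5, theta = 0.7.
args = dict(t=2.0, r=1.5, theta=0.7, u=0.3, c=1.0)
difference = abs(t_star_by_signals(**args) - t_star_closed_form(**args))
```

The two computations agree to within rounding error, whatever admissible sample values are chosen; the result depends on r and θ only through the product r cos θ, i.e. through x'.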
1 Professor Prokhovnik pointed out to me the following interesting result. Suppose we use
Einstein’s convention in order to synchronise our clocks in O’, but, instead of using rigid
rods for the measurement of spatial coordinates, we simply insist that the latter must be
so chosen that the velocity of light is equal to c in all directions of the moving frame O’.
Then, instead of (14), we obtain the following:
$x^* = \gamma^2(x - ut)$, $y^* = \gamma y$, $z^* = \gamma z$, $t^* = \gamma^2(t - ux/c^2)$.
It is easily verified that:
$dx^{*2} + dy^{*2} + dz^{*2} - c^2dt^{*2} = \gamma^2[dx^2 + dy^2 + dz^2 - c^2dt^2]$
Hence:
$[dx^{*2} + dy^{*2} + dz^{*2} - c^2dt^{*2} = 0] \leftrightarrow [dx^2 + dy^2 + dz^2 - c^2dt^2 = 0]$
A process therefore has velocity c in O iff it has velocity c in O’.
Also see Prokhovnik [1967], chapter 5.
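The identity stated in the footnote can be checked numerically. The sketch below assumes the transformation reads x* = γ²(x - ut), y* = γy, z* = γz, t* = γ²(t - ux/c²); this reading is a reconstruction of the scanned formula. Since the transformation is linear, coordinate increments transform in the same way as the coordinates. All names and sample values are illustrative.

```python
import math


def starred_increments(dx, dy, dz, dt, u, c):
    """Increments (dx*, dy*, dz*, dt*) under the assumed transformation."""
    g2 = 1.0 / (1.0 - (u / c) ** 2)  # gamma squared
    g = math.sqrt(g2)
    return g2 * (dx - u * dt), g * dy, g * dz, g2 * (dt - u * dx / c**2)


def interval(dx, dy, dz, dt, c):
    """The quadratic form dx^2 + dy^2 + dz^2 - c^2 dt^2."""
    return dx**2 + dy**2 + dz**2 - c**2 * dt**2


c, u = 1.0, 0.6            # illustrative values, not from the text
d = (0.3, -1.2, 0.5, 0.9)  # an arbitrary displacement (dx, dy, dz, dt)
lhs = interval(*starred_increments(*d, u, c), c)
rhs = interval(*d, c) / (1.0 - (u / c) ** 2)  # gamma^2 times the unstarred form
print(lhs, rhs)  # the two values should agree

light = (0.6, 0.0, 0.8, 1.0)  # a displacement along a light ray in O
print(interval(*starred_increments(*light, u, c), c))  # should be close to zero
```

Because the two quadratic forms differ only by the non-zero factor γ², one vanishes exactly when the other does, which is the footnote's conclusion about light signals.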
We can thus conclude that the role played by Machism in the emergence
of Special Relativity is negligible. If one adopts a strictly Machist view of
definitions, then Einstein’s convention violates Mach’s vicious circle
principle. If one adopts a more liberal operationalist attitude, then
Einstein’s convention still fails to distinguish between classical physics
on the one hand and Special Relativity on the other. Finally, Special
Relativity, because it postulates a privileged set of inertial frames, is just
as ‘absolute’ as classical mechanics; provided of course that the unique
absolute frame of Newtonian Mechanics be replaced by the infinite set of
Galilean frames.

In connection with Mach’s influence on the genesis of General Relativity,
the following quotation from his work on the ‘Conservation of
Energy’ will be found very revealing:


Obviously it does not matter whether we think of the earth as turning round 
on its axis, or at rest while the celestial bodies revolve round it. Geometrically 
these are exactly the same case of a relative rotation of the earth and of the 
celestial bodies with respect to one another. Only, the first representation is 
astronomically more convenient and simpler. 

But if we think of the earth as at rest and the other celestial bodies revolving 
round it, there is no flattening of the earth, no Foucault’s experiment and so on,
at least according to our usual conception of the law of inertia. Now, one can 
solve the difficulty in two ways: either all motion is absolute, or our law of 
inertia is wrongly expressed. Neumann preferred the first supposition, I, the 
second. The law of inertia must be so conceived that exactly the same thing 
results from the second supposition as from the first. By this it will be 
evident that, in its expression, regard must be paid to the masses of the 
universe.1 


Mach can be taken as requiring that a condition of general covariance 
be imposed on certain theories. He maintains that we should be able to 
explain the oblateness of the earth whether we adopt the stars or the earth 
itself as our frame of reference. He does not explicitly assert that the explana- 
tions must be the same in both cases, but some such assertion is implicit 
in the above passage. Mach must have known that classical theories can 
be referred to rotating frames, provided one introduces an inertial field 
determined e.g. by the relative motion of the earth and the stars. It was 
also known that physical laws assume a different—indeed a more compli- 
cated—form in a rotating frame. Thus, in order to make sense of the above 
quotation, i.e. in order to make it non-vacuous, we have to assume that 
Mach was suggesting that certain laws should assume the same form in all 
frames, i.e. should be generally covariant. Thus Einstein was right in 
claiming that


1 Mach [1909], pp. 76-7.

Mach clearly recognised the weak spots in Classical Mechanics and was not
far from requiring a General Theory of Relativity, all this about half a century
ago!1

Einstein’s arguments in favour of general covariance are well-known
and can be briefly formulated as follows. Laws have an ontological status 
over and above the observables which they correlate with one another. 
Moreover, simplicity is an objective property of scientific laws. Hence, if 
the laws assume a particularly simple form in some frames of reference, 
this would indicate that these frames are privileged over others. But, 
according to the Relativity Postulate, there exist no privileged frames. 
All laws must therefore assume the same form in all frames of reference. 
This argument makes essential use of the assumption that laws and their
form possess an objective status. We have already seen that Machism 
contradicts this assumption. According to Machism, a law is nothing more 
than a computational device; both the device and its degree of convenience 
have a subjective function. Hence the simplicity of scientific laws does not
indicate that the chosen frame of reference is privileged in any ontological
or objective sense; in fact the frame as such, being unobservable, does not
belong to Mach’s ontology. There is no good reason, from a Machian
viewpoint, why the choice of a certain coordinate system should not
prove more convenient, i.e. more economical, than another. There is in
fact every reason, from a Machian viewpoint, to congratulate a scientist
on finding the most convenient coordinate system for the simplest, i.e.
for the most economical, formulation of physical laws.* In proposing
general covariance, Mach was therefore violating a tenet of his own brand
of positivism; he was instinctively following a philosophy of science with
strong Platonistic undertones: apart from the elements of sensations, the
scientific theories which express the structure of the world have an ontological
status, so that the form of our theories reveals an objective property
of the frames to which they are referred.


London School of Economics 
INTRODUCTION


In the seventeenth century a famous debate occurred between Newton 
and Leibniz on the question of whether space is an entity in its own
right, or merely a system of relations of bodies. Near the end of the nine- 
teenth century Mach attempted to defend the Leibnizian relationist view 
by putting forth what is now called Mach’s principle: (roughly) that the 
fixed stars can play the role in physics which Newton attributed to space. 
And early in the twentieth century Einstein claimed to have vindicated 
Mach’s critique of Newton by embodying this principle in his general 
theory of relativity. Eventually, however, it became apparent that in 
many respects general relativity is inconsistent with Mach’s point of view. 
Supporters of Leibniz and Mach may therefore be heartened by the recent 
development of a new version of general relativity—the initial-value 
formulation—which is in many ways closer to the Machian point of view 
than Einstein’s original version. The main purpose of this paper is to 
determine just how close it is. I will argue that the roles which space and 
spacetime play in the new formulation—especially their causal roles in 
certain senses—make it fundamentally anti-Machian. 

There is already a massive literature on the issues surrounding absolute 
vs. relational theories of space, Mach’s principle, and general relativity. 
But it seems to me that even the best of this literature suffers from two 
rather serious defects: (1) it does not take account of the post-Einsteinian 
development of the initial-value formulation of general relativity; (2) discussions
of Mach’s principle are often purely technical and thus detached
from their original philosophical context. For example, Sklar ([1974],
p. 221) notes that the initial-value formulation is not subject to his main
objections to the incorporation of Mach’s principle into general relativity,
and then merely remarks that ‘it is still far from clear that the resulting 
modified theory is yet in conformity to Machian expectations in their 
fullest extent’. Similarly, Earman ([1970a]) makes only a few brief (though 
interesting) remarks about the relevance of the initial-value formulation 
to Machian philosophy. Graves ([1971], ch. 17) gives a clear and extensive 
discussion of the technical aspects of the new formulation, but says almost 
nothing about its implications for the Newton-Leibniz-Mach dispute.

Since my principal aim will be to discuss the implications of some
scientific developments, especially the initial-value formulation, for the 
historical dispute between absolute and relational theories of space, I 
shall begin by reviewing the relevant history. Since some of this history 
will be familiar to many readers, I shall be quite concise in dealing with 
well-known points, and treat in detail only relatively neglected ones. 

A subsidiary theme of the paper will be the distinction between two types 
of relational theory: that which says that space is merely a set of relations 
of bodies, and that which says that metrical properties of intervals are 
relations to congruence standards. I shall therefore devote some attention 
to the question of which historical figures held which view(s), and to the 
relevance of general relativity to each.


1 NEWTON


In his Principia, originally published in 1687, Isaac Newton espoused the 
following theses, which one may regard as aspects of an absolute or substantival
theory of space, and as denials of both types of relationism just
mentioned. 

(A) Absolute space exists as an entity in its own right, not merely as a 
system of relations of bodies. 

(B) There is absolute motion. 

(C) Absolute motion is motion with respect to absolute space. 

(D) Distance is a two-place real-valued function of bodies or points, 
and is not to be identified by definition with the results of any 
particular sort of measuring process. 

(A)-(C) are easy to document: Newton ([1934], pp. 6-7) says, ‘Absolute
space, in its own nature, without relation to anything external, remains
always similar and immovable.’ Further, ‘[absolute] place is a part of
[absolute] space which a body takes up’, and ‘Absolute motion is the translation
of a body from one absolute place into another’. Newton also notes
that there are coordinate systems of absolute space which have a body at 
some fixed coordinates, and calls such coordinate systems ‘relative spaces’.
Relative motion can thus be defined as change of coordinates in a relative 
space, or as motion with respect to some body. 

(D) is somewhat more difficult to document. Newton does say (p. 11) 
‘Nor do those less defile the purity of mathematical and philosophical 
truths, who confound real quantities with their relations and sensible 
measures’, which certainly suggests (D). However, the context concerns 
the distinction between absolute and relative motions and this quotation
may refer to that distinction, rather than to that between a quantity 
(whether absolute or relative) and the result of a measurement. But since 
Newton (pp. 6-8) did explicitly distinguish definitionally between duration 
in absolute time and the number of cycles of any physical process, it is 
likely (though not certain) that he espoused the analogous spatial thesis— 
viz., (D). 

Since absolute space cannot be seen, we obviously cannot detect absolute 
motion in any fashion analogous to the detection of relative motion by 
observation of a change of distance between two bodies. Nor, according 
to Newton (p. 20), can we tell from the motions of bodies in some co- 
ordinate system whether that system is absolutely at rest or in uniform 
rectilinear motion; for the motions are the same in either case. Rather, 


The effects which distinguish absolute from relative motion are, the forces of 
receding from the axis of circular motion. . . . For instance, if two globes, kept 
at a given distance one from the other by means of a cord that connects them, 
were revolved about their common center of gravity, we might, from the tension 
of the cord, discover the endeavor of the globes to recede from the axis of their 
motion, and from thence we might compute the quantity of their [absolute]
circular motions. (pp. 10-12) 
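Newton's inference from the tension of the cord to the rate of rotation can be illustrated with a small calculation. The sketch below is not Newton's own procedure: it assumes the textbook idealisation of two equal globes, each of mass m and at distance r from their common center of gravity, joined by a massless cord whose tension T supplies the centripetal force, so that T = mω²r. The function name and the sample values are hypothetical.

```python
import math


def angular_speed_from_tension(tension, mass, radius):
    """Angular speed omega implied by the cord tension, from T = m * omega**2 * r.

    tension -- tension in the cord
    mass    -- mass of each globe
    radius  -- distance of each globe from the common center of gravity
    """
    if tension < 0 or mass <= 0 or radius <= 0:
        raise ValueError("tension must be non-negative; mass and radius positive")
    return math.sqrt(tension / (mass * radius))


# Illustrative values: the tension is measured without reference to any other body.
omega = angular_speed_from_tension(tension=9.0, mass=2.0, radius=0.5)
print(omega)  # rate of rotation inferred from the tension alone
```

The point of the passage is that the input is a locally measurable quantity, yet it fixes a rate of rotation; for Newton that rotation can only be rotation with respect to absolute space.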


In general, in certain coordinate systems all terms in the accelerations of
bodies will be due to forces exerted by other bodies.1 Such frames are
called inertial. In non-inertial frames an additional centrifugal term ap- 
pears. According to Newton the non-inertial frames are precisely those 
which are in non-uniform absolute motion. Thus, the existence of accelera- 
tive tendencies not due to physical interactions of bodies establishes the 
existence of absolute motion and by (C) of absolute space. 

An important question in this paper will be in what senses, if any, 
Newton (and other theorists) attribute causal efficacy to space. As we shall 
see later, Einstein ([1916]) claimed that Newton considered absolute 


1 Ascription of this view to Newton is justified by quotations in the next paragraph.
space to be a cause of the phenomena (such as the flattening of a sphere)
associated with rotational motion. Let us consider various senses in which 
Einstein’s exegetical claim might be understood. Newton certainly did 
not maintain that space is causally efficacious in the sense that particular
points of space can be the origins and centers of forces, that ‘place . . . exerts 
a certain influence’ (Aristotle, Phys. 208b). For Newton says: ‘That 
forces should be directed to no body on which they physically depend, 
but to innumerable imaginary points on the axis of the earth, is an hypo- 
thesis too incongruous’ (p. 553), and ‘attractions are made towards bodies’ 
(p. 164). Nor does Newton think that space is efficacious in the following 
sense: space has a particular one of various logically possible structures,
and bodies follow the trajectories they do partly because of the particular 
structure space has. (Let us call this conception space as a passive cause, 
to indicate that no changes in the structure of space figure in the explanation
of motions.) For, writing two centuries before Riemann, Newton was 
hardly in a position to consider the possibility that space might have a 
structure—e.g. variable curvature—different from the one it actually has. 
(This paragraph follows Shapere [1964].) 

Einstein’s historical sense was presumably good enough that he did 
not mean to attribute either of these two senses of space’s causal efficacy 
to Newton. Much less could he have meant to say that Newton considered 
space to be what I shall call an active cause: a physical entity to whose 
changes one can correctly appeal to explain other events or conditions. For 
Einstein surely knew that Newton said that absolute space ‘remains always 
similar and immovable’. The only possibility left, so far as I can see, is 
that Einstein simply meant that Newton attributed a causal role to space 
in what I shall call the initial-condition sense: a physical entity has a causal
role in this sense if it would be correct to mention it in the initial conditions 
of an explanation of an event or condition. This sense is weaker than the
other two, which it includes as special cases. The restriction to physical 
entities is necessary to exclude such objects as numbers and functions. 
‘Physical’ is intended broadly enough to include space. The term is vague, 
but precisely to the degree that Einstein’s claim is vague, I believe. Clearly, 
Newton did believe that absolute space is a cause in the initial-condition 
sense; for in explaining accelerations not directed towards bodies he 
mentioned absolute space in his initial conditions, viz. motion with
respect to absolute space.


2 LEIBNIZ 


Leibniz, in his correspondence with Clarke in 1715-16, maintained the 
following version of a relational theory of space: 
(A’) Space is not an entity in its own right, but merely a set of ways in
which simultaneously existing bodies could be ordered. (Alexander
[1956], pp. 25 f.)

(B’) All motion is relative to some body.

(C’) There is absolute motion of a body B, but it is not motion with
respect to absolute space, but relative motion whose cause is in B.
(Ibid., p. 74)

Leibniz’s claim that space is an ‘order of coexistences’ suggests that the
relations between bodies of which he thought space is a set are solely
relations of order and not of quantity, or, in modern terms, that they are
all topological (preserved under one-to-one bicontinuous transformations)
and not metric (preserved under rigid transformations). It thus may appear
that Leibniz denies Newton’s thesis that distance is a genuine relation
between bodies.

Clarke takes him this way (and analogously for time), as his criticisms 
indicate: ‘space and time are quantities, which situation and order are 
not’; ‘that time is not merely the order of things succeeding each other is 
evident; because the quantity of time may be greater or less, and yet 
that order continue the same’ (ibid., pp. 32, 52). 

A rational person is very unlikely to hold a theory of space in which
distance plays no role at all. Accordingly, van Fraassen ([1970], pp. 70-3)
interprets Leibniz as holding that distance is not a relation of two arbitrary
bodies, but a relation which they bear to a third object called a congruence
standard.1 Thus, the distance between two bodies is by definition the minimum
number of times the congruence standard can be laid along a path
between them. On this view Leibniz is a relationist in the further sense 
that he holds 


(D’) Distance is a real-valued three-place function of pairs of bodies 
and congruence standards, whose values are defined by a measuring 
procedure involving the standard. 


However, when Leibniz replies to Clarke’s objection, he writes: 


I answer, that order also has its quantity; there is in it, that which goes before, 
and that which follows; there is distance or interval. Relative things have their 
quantity, as well as absolute ones. For instance, ratios or proportions in mathe- 
matics, have their quantity, and are measured by logarithms; and yet they are 


1 Note that van Fraassen does not attribute to Leibniz the Reichenbach-Grünbaum view
that it is a matter of convention whether a rod remains self-congruent under transport. 
On van Fraassen’s interpretation, presumably all that is conventional is the choice of a 
unit of length. 

In discussing the second type of relationism in this paper, I shall ignore the fact that 
it is sometimes associated with the doctrine of conventionality of self-congruence. The 
sole question I am discussing is whether metrical properties are intrinsic rather than
relational. 
relations. And therefore though time and space consist in relations, yet they have
their quantity. (Alexander [1956], p. 75) 


While this is not as lucid as one might wish, I suggest that what Leibniz 
means is that in his definition of ‘space’, the term ‘order’ is to be taken
not in the usual (topological) sense, but as including from the outset 
relations of distance: ‘there is in [order] . . . distance or interval’. And in 
comparing distance to proportion, as both being ‘relative things [which] 
have their quantity’, he means merely that they are both real-valued two- 
place functions, contrary to (D’). 

The passage on which van Fraassen relied is in The Metaphysical 
Foundations of Mathematics (c. 1715-16). There Leibniz says:


Quantity or magnitude is that in things which can be known only through their
simultaneous compresence—or by their simultaneous perception. Thus it is impossible
for us to know what a foot or a yard is unless we actually have something
to serve as a measure which can be applied to successive objects after each
other... . Quality, on the other hand, is what can be known in things singly, without
requiring any compresence. ([1969], p. 667)


This passage may seem to suggest that Leibniz holds that since the dis- 
tance between two bodies is a quantity, it consists in their relation to a 
congruence standard, à la (D’). But all he says is that it is through a 
congruence standard that we can know that a particular length or distance 
exists. He does not say that the very existence of the quantity depends 
on the existence of the standard. 

This interpretation is supported by an earlier passage, in which Leibniz 
says: ‘In each of the two orders—that of time and space—we can judge 
relations of nearer to and farther from between its terms, according as 
more or less middle terms are required to understand the order between 
them’ (ibid.). In the case of duration, ‘more middle terms’ means ‘more 
successive and like states interposed’ (Alexander [1956], p. 89) between 
a process’s end points. In the spatial case, it is entirely unclear what he 
could mean if not that the length of a path is proportional to the number 
of points it contains. (In both cases there are obvious difficulties in his 
position which are not relevant just now.) His position therefore seems to 
be that length (and duration) exist whether any standard does or not, and 
that a standard is needed only for the purpose of determining and express- 
ing a quantity in terms of the unit defined by the standard. 

I conclude that Leibniz is not a relationist in the sense of (D’), and that 
his version of relationism must therefore be distinguished from that of 
Reichenbach ([1957], p. 37), who claims that geometrical statements
characterize relations to rigid rods, or Grünbaum ([1963], pp. 10 f.), who
only relatively to a particular congruence standard.




3 MACH


On the ground that science is merely the economical description of
sensations ([1960], p. 579), and that absolute space and absolute motion 
do not appear in our experience, Mach espoused in 1883 the following 
relational theory, similar to Leibniz’s: 


(A”) Absolute space does not exist. 

(B”) Absolute motion does not exist: all motion is relative to some other 
body or bodies. (pp. 280 f.) 

(C”) The fixed stars, not absolute space, define the class of inertial 
frames. (p. 285) 


How, then, does Mach propose to deal with Newton’s inference from 
centrifugal tendencies to absolute motion and thence absolute space? 
(Leibniz dealt with this problem by asserting that all frames of reference
are mechanically equivalent ([1969], p. 418), but did not in any passage
known to me explain his grounds.) There are two problems: first, how
can one specify the class of frames in which centrifugal tendencies do not 
appear (Newton’s laws of motion hold) without referring to absolute 
space? Second, what shall we say causes such forces, if not rotation with 
respect to absolute space? Mach’s answer is that the so-called fixed stars 
can fill both roles. That is, they determine a frame with respect to which 
Newton’s laws hold. Moreover ‘centrifugal forces . . . are produced by 
[a body’s] relative rotation with respect to the mass of the earth and the 
other celestial bodies’ (p. 284). Presumably, this means that large masses 
exert some sort of influence on a body rotating with respect to them and 
thereby produce centrifugal tendencies. (On Mach’s view it would make
only a verbal difference if we said instead that the large masses rotate
with respect to the other body (ibid.).) Unfortunately, Mach did not attempt
to specify the laws in virtue of which large masses produce these
effects.


4 EINSTEIN 


In his 1916 paper on general relativity, Einstein indicates his agreement 
with Mach’s critique of Newton. He writes that no explanation of the 
flattening of one of two spheres S₁ and S₂ in relative rotation can


be admitted as epistemologically satisfactory, unless . . . observable facts ulti- 
mately appear as causes and effects. 
Newtonian mechanics does not give a satisfactory answer... [for] the
privileged space R₁ of Galileo . . . is a merely factitious cause, and not a thing
that can be observed.

[Instead], we have to take it that the general laws of motion... must be such
that the mechanical behavior of S₁ and S₂ is partly conditioned, in quite
essential respects, by distant masses. ([1916], pp. 112 f.)
In a later work ([1922], p. 108) Einstein espoused a similar principle: 
‘the mechanical properties of space [are] completely determined by 
matter’. The phrase ‘the mechanical properties of space’ is evidently a
somewhat imprecise reference to ‘the properties of the space-time continuum
which determine inertia’, as he put it elsewhere (ibid., p. 56). To
determine inertia means, in part at least, to determine which frames are
inertial, or in a four-dimensional context, locally Lorentz, i.e. having a
Minkowskian metric with zero derivatives at a given spacetime point and
thus no centrifugal (or Coriolis) accelerations. (Einstein also (ibid., p. 100)
took inertia in this context to refer to magnitude of inertial mass; but 
since I see no warrant for this in Mach, or relevance to disputes about 
relational theories of space, I propose to ignore this point.) I shall, then, 
call Mach’s principle the statement: the matter of the universe fully deter- 
mines which space-time frames are locally Lorentz. One of Einstein’s 
(ibid., pp. 56, 100) main motives in constructing the general theory of
relativity was to vindicate this principle with a precision inaccessible to 
Mach himself. 

One further piece of history which will shortly be relevant concerns the 
cosmological term in the field equation of general relativity. In his paper 
of [1915], Einstein gave the field equation in a form equivalent to G = 8πT,
where G is the Einstein tensor, measuring the curvature of spacetime, and 
T is the stress-energy tensor, specifying the mass-energy density, momen- 
tum density, and stress at each point. In [1917], however, ‘for the purpose 
of making possible a quasi-static distribution of matter’—that is, one in 
which the universe does not have increasing or decreasing radius—he 
modified the equation to read (in the present notation): G + Λg = 8πT,
with g the metric tensor and Λ the ‘cosmological constant’. However,
after Hubble’s discovery of the expansion of the universe, Einstein ([1931]) 
thenceforth omitted the cosmological term, since what he had taken to be 
the empirical necessity for its introduction turned out to be mistaken. 
Moreover, its presence is highly undesirable from a theoretical point of 
view, since it ‘seriously reduces [the equation’s] logical simplicity’ (Einstein 

1 Lest the name mislead, note that the principle was authored by Einstein and merely
inspired by Mach: the latter’s ‘fixed stars’ have become ‘the matter of the universe’,
and the ‘causation of centrifugal forces’ has become the ‘determination of inertia (the
local Lorentz frames)’.
[1945], p. 111). Gamow ([1970]) reports that Einstein told him he regarded
the cosmological term as ‘the biggest blunder of my life’. 


5 EINSTEIN vs. MACH


Despite Einstein’s hopes, numerous objections have been raised to the 
proposition that his general theory of relativity embodies Machian ideas. 
Some of the most important are the following: 


(1) In Gödel’s solution to the field equation, the entire mass of the
universe is in absolute rotation, which is nonsensical according to
Mach’s thesis of the relativity of all motion. Moreover, if as Mach
believed, inertial frames are those unaccelerated with respect to the
main bulk of the universe, Gödel’s universe should be at rest in an
inertial frame. But it is not (Earman [1970a]).

(2) To obtain a particular solution to the field equation, it is often 
necessary to use not only the distribution of matter embodied in 
the stress-energy tensor, but also boundary conditions at spatial 
infinity. In such cases it is not matter alone (or more generally, T
alone) which determines the spatiotemporal geometry and thus the
class of locally Lorentz frames (ibid.).

(3) The existence of solutions of the field equation for empty spacetime 
—e.g. the solution g = the Minkowski metric of special relativity— 
shows that according to general relativity spacetime may have a 
structure not due to matter (ibid.).

(4) It makes no sense to speak of the distribution of matter determining 
geometry, since the geometry is required in order to specify the 
distribution (Wheeler [1964], p. 306). For example, the metric 
tensor appears explicitly in the stress-energy tensor of a perfect 
fluid. 

(5) The four-dimensional character of the field equation makes it ill-
suited to describe the sort of causal influence asserted by Mach’s 
principle. One solves the equation by specifying T and perhaps 
boundary conditions throughout spacetime. Thus it is unclear how 
one could use the equation in the usual method of studying causal 
influences: specifying initial conditions at one time and using laws 
of evolution to determine the resulting conditions at a later time
(Graves [1971], pp. 235-7).


6 RECONCILIATION? 


I shall not attempt in this paper to consider every possible attempt to 
demonstrate non-Machian aspects of general relativity. Instead, I shall 
simply consider the extent to which a person concerned to defend the
Machian character of general relativity could respond to the above five
claims, which are certainly among the most striking and interesting of their
kind.

The first seems easiest to deal with. As Adler et al. show ([1965], ch. 12),
one can define a quantity $\Omega^\alpha = \epsilon^{\alpha\beta\gamma\delta}(-g)^{-1/2}V_\beta V_{\gamma;\delta}$, called the vorticity of
the vector field $V^\alpha$. If $V^\alpha$ is the four-velocity of matter, $\Omega^\alpha$ at a point P
indicates the rotation of matter at P about an inertial observer there. However,
for the Gödel universe it turns out that the angular velocity ω of this rotation
is $\sqrt{-\Lambda}$, from Adler’s equations 12.52 and 12.69. Thus, if we follow
Einstein’s view from 1931 to his death and set Λ = 0, the rotation vanishes.
It is therefore rather misleading to cite this solution as a non-Machian
aspect of Einstein’s general theory of relativity, since it requires a version
of the theory (with Λ ≠ 0) which Einstein regarded as only a temporary
aberration and, indeed, his ‘greatest blunder’.

But this Machian reply turns out to be insufficient. For Gödel ([1950])
has shown that there are solutions to the field equation with Λ = 0 for
which $\Omega^\alpha$ is everywhere non-zero. (In fact, given a solution with Λ ≠ 0,
one can regard it as a solution with Λ = 0 by the simple expedient of
transporting Λg to the other side of the field equation and absorbing it
into the stress-energy tensor. The resulting T may not be very plausible
physically, but then again neither is much else about the Gödel solution,
such as the possibility of closed causal chains.) A better Machian reply to
Gödel is therefore that ‘$\Omega^\alpha \neq 0$ everywhere’ is an inadequate criterion
for rotation of the material universe as a whole. That $\Omega^\alpha \neq 0$ everywhere
guarantees only that each local Lorentz observer would see nearby matter
swirling around him, but not that any type of net rotation exists. Moreover,
it is unclear that any other criterion of net rotation can be stated in terms
of vorticity. Isenberg ([1977]), whom I follow in this paragraph, shows
that one plausible definition of net rotation as a certain volume integral
involving local vorticity leads to its being zero in some cases where $\Omega^\alpha$ is
a non-zero constant. Thus there is no obvious connection between local
vorticity and “net rotation”. In the absence of some more satisfactory
definition of net rotation in terms of $\Omega^\alpha$ than the two rejected above, the
Machian can ignore local vorticity, since Mach’s principle does not assert
that a local Lorentz frame is determined by and at rest with respect to the
matter in its vicinity.
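The parenthetical step in this paragraph, moving the cosmological term across the field equation, can be written out explicitly. What follows is only a sketch in the notation used here (the equation in the form G + Λg = 8πT); the symbol $T_{\mathrm{eff}}$ is introduced for illustration and does not appear in the sources cited.

```latex
\begin{align*}
G + \Lambda g &= 8\pi T \\
G &= 8\pi T - \Lambda g \\
  &= 8\pi\left(T - \frac{\Lambda}{8\pi}\, g\right) \\
  &= 8\pi\, T_{\mathrm{eff}},
  \qquad T_{\mathrm{eff}} := T - \frac{\Lambda}{8\pi}\, g .
\end{align*}
```

On this reading, any pair (g, T) that solves the equation with Λ ≠ 0 also solves the Λ = 0 equation with source $T_{\mathrm{eff}}$; whether $T_{\mathrm{eff}}$ describes physically reasonable matter is a separate question, as the text notes.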

Another approach to defining net rotation utilises angular momentum 
rather than vorticity. As Misner, Thorne, and Wheeler show ([1973],
pp. 450-63; hereinafter MTW), an isolated body’s intrinsic angular
momentum can be defined as a certain three-vector which appears in the


Relationism and Relativity 225 


metric governing tHe asymptotically flat region far from the body, or in 
terms of a flux integral over a surface in this region. The quantity thus 
defined can be measured by means of the precessions of distant gyto- 
scopes or satellite-orbits. Obviously, rotation defined as intrinsic angular 
momentum makes no theoretical or experimental sense in the absence of 
asymptotic flatness. Now general relativity does admit solutions where 
matter is confined to a finite spatial region of an asymptotically flat spacetime 
and has a non-zero total angular momentum, e.g., the Kerr-Newman 
geometry for a rotating black hole (MTW, p. 891). One who wishes to 
reconcile general relativity with Mach must therefore concentrate on 
finding some way to rule out the measurement of a non-zero total angular 
momentum by a distant local Lorentz observer. (Such a situation would 
clearly be un-Machian, since the rotation would not be with respect to any 
matter, and since the local Lorentz frames would not be determined by 
matter but only perturbed by it in a relatively small region.) 

Explaining how this can be done—and objections (2)-(5) answered as 
well—requires introducing a new and significantly different formulation 
of general relativity, which some writers—e.g. MTW (ch. 21) and Graves 
([1972], ch. 17)—consider the proper way to incorporate Mach’s principle 
into the theory. I shall try to explain this ‘initial-value formulation’ 
(York’s version) in a relatively non-technical fashion. The basic idea is to 
begin with the general-relativistic analogue of an initial instant, to specify 
certain initial data at that “instant”, and then to use the equations of the 
formulation to obtain the spatiotemporal metric throughout spacetime, 
thereby determining the class of local Lorentz frames at each spacetime 
point. We replace the notion of an instant with that of a spacelike hyper- 
surface S: a three-dimensional submanifold of spacetime such that any 
displacement $dx^{\mu}$ lying wholly within it is spacelike ($g_{\mu\nu}\,dx^{\mu}dx^{\nu} > 0$). The 
reason that this notion corresponds to that of an instant—or, more closely, 
to space-at-an-instant—is that at each point P of S there is a local Lorentz 
frame whose surface of simultaneity coincides locally with S. One can 
think of S as formed by the meshing of small pieces of the simultaneity- 
surfaces of various local Lorentz frames. We then proceed as follows: 
(a) Select a particular spacelike hypersurface S which is closed (finite) and 
specify the time parameter on it. (b) The following Machian data may 
then be chosen arbitrarily: (i) the conformal three-metric, which specifies 
the intrinsic geometry of S up to a position-dependent scale factor; 
(ii) the shear part of the rate of change of this metric; (iii) the density of 
non-gravitational energy and energy flow on S, up to a conformal factor. 
(c) Insert the Machian data in the components of the field equation known 
as the initial-value equations and solve for the physical data: the full (not 
just conformal) geometry of S, both intrinsic and extrinsic (relative to the 
enveloping spacetime), and the full density of energy and of energy flow 
on S. (d) Using the rest of the components of the field equation (and 
initial conditions and laws of evolution of any fluid and non-gravitational 
fields), obtain the four-metric of the entire space-time, or in some cases of 
the part of it called the ‘domain of dependence’. The point of distinguishing 
the two kinds of data is that the Machian data can be specified arbitrarily, 
whereas the physical data are guaranteed to satisfy the initial-value equa- 
tions. The step from the physical data to the four-metric is known always 
to yield a unique solution within a finite time, and for the whole space- 
time in favourable universes. Existence and uniqueness of the physical 
data, given arbitrary Machian data, has been proven for all physically 
plausible fields (except for some isolated data-sets known to give no 
solution). The requirement of closure is used in step (a) because of a 
conjecture by Wheeler and others that it guarantees existence and unique- 
ness. (For further details and references see Isenberg ([1976], [1977]), 
MTW (ch. 21), and Ó Murchadha ([1974]).) 
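
As a worked illustration of the spacelike condition used above to define S, consider the simplest possible case. The choice of flat (Minkowski) spacetime, of the signature (−, +, +, +), of units in which c = 1, and of the coordinates (t, x, y, z) are all assumptions made here for the sake of the example; they are chosen so that ‘spacelike’ corresponds to a positive interval, as in the text.

```latex
% Assumed: Minkowski metric, signature (-,+,+,+), units with c = 1.
g_{\mu\nu}\,dx^{\mu}dx^{\nu} = -dt^{2} + dx^{2} + dy^{2} + dz^{2}
% Hypersurface S: t = t_0, so every displacement lying within S has dt = 0:
\left. g_{\mu\nu}\,dx^{\mu}dx^{\nu} \right|_{S} = dx^{2} + dy^{2} + dz^{2} > 0
\quad \text{for any non-zero displacement.}
```

In this special case S coincides globally with the simultaneity surface of a single Lorentz frame; in a curved spacetime the coincidence holds only locally at each point P of S, which is the situation the text describes.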

Let us reflect on the significance of all this for the question of the 
Machian character of general relativity. Assuming that its implicit restric- 
tions on the universe (closure, etc.) turn out to be true, the initial-value 
formulation provides a clear sense in which the universe’s matter (or 
rather its relativistic correlate, mass-energy) determines which frames are 
locally Lorentz, as Mach’s principle asserts. We now no longer have the 
mere hint of a suggestion for a non-Newtonian theory of inertia, but the 
actual equations of one. Moreover, we seem at first glance now to have an 
answer to objections (1)-(5) in the last section. The Gödel solution 
mentioned in (1), whether or not it has net rotation in any sense, is ruled 
out on the ground that it contains closed timelike lines and therefore cannot 
be sliced into spacelike portions. Moreover, rotation defined by observers 
at spatial infinity is ruled out by the requirement of spatial closure. (2) no 
longer applies, since the requirement that S be closed means that there is 
no spatial infinity and thus no problem of imposing boundary conditions 
there to help the mass-energy determine inertia. Of course, closure is 
itself a boundary condition. But it is not a boundary condition on the 
inertia: it is not an inertial property of spacetime specified a priori, inde- 
pendent of mass-energy. There is no longer any circularity as asserted 
by (4): the Machian data specify the conformal three-metric on S, whereas 
the solution is the entire four-metric. (5) is inapplicable, since now the 
dynamical features of the theory stand out in clear relief. (3), the existence 
of general-relativistic universes empty of matter and non-gravitational 
fields, may still seem to be a problem. However, MTW (p. 549) argue 
that if we impose the requirement of closure on such a universe, it will 
necessarily contain gravitational radiation whose energy is responsible 
for the curvature. If so, one can still say in Mach-like fashion that mass- 
energy determines inertia. But as MTW themselves note on page 725, a 
possible solution can be obtained by taking a flat hypersurface, empty of 
everything including gravitational radiation, and identifying opposite 
faces of a cube. The resulting 3-torus is closed and is therefore suitable for 
use in step (a) to obtain a totally empty spacetime whose geometry is 
therefore not determined by mass-energy. The possibility of this universe 
therefore remains a limitation on the Machian character of the new 
formulation. 


The suggestion to require closure was also made by Einstein ([1922], 
p. 108), on the ground that otherwise the necessity for boundary conditions 
at infinity would refute Mach. However, he stated the requirement as the 
closure of space. But as Earman ([1970b]) has noted, a given spacetime 
may possess both open and closed spacelike hypersurfaces, so that the 
question of whether a given four-metric satisfies Einstein’s requirement 
is ill-framed. However, as Earman noted in a later paper ([1972]), this is 
not a problem for the initial-value formulation. In order for its equations 
to be repeatedly integrated to obtain the full four-metric, one need only 
require that spacetime be covered by a family of closed spacelike 
Cauchy surfaces, so that there is a suitable starting point for each 
iteration. That there may be open spacelike hypersurfaces as well is 
unimportant. 

The issue of closure does raise a genuine and admitted problem for the 
initial-value formulation, namely, that present astronomical data are 
insufficient to determine whether the requirement just mentioned is 
satisfied by the actual universe (MTW, ch. 29). Worse, many workers 
have recently said that the available evidence suggests openness, e.g. 
Gott et al. ([1976]). Thus, full success of the new programme awaits 
empirical validation. Observational falsification would not, of course, 
affect its value as a precise explication of Mach’s principle. 

My concern, however, is not to assess this programme from a scientific 
point of view, but rather to ask the question: to what extent would its 
success show (1) that general relativity does embody a Machian philo- 
sophical position and thus (2) that Mach was right all along in his critique 
of Newton? (Negative answers to these questions need not distress 
scientific advocates of the programme, of course, since they may rightly 
maintain that its scientific value is independent of its relationship to 
Machian philosophy and to Einstein’s Machian motivations.) What I 
shall maintain is that though the initial-value formulation does give 
specific content to Mach’s idea that matter there governs inertia here, it is 
inconsistent (at least prima facie) with every one of Mach’s philosophical 
reasons for this idea and with Einstein’s motivations for invoking it as 
well. (Again, I do not say that Wheeler and company claim to be faithful 
to Mach in all respects. The relations of their work to Mach’s are none- 
theless worth discussing.) 

Let us recall that one of Mach’s main aims was to claim that absolute 
space, because it is unobservable, does not exist, and therefore to refute 
Newton’s arguments to the contrary. However, general relativity, old or 
new formulation, obviously postulates an entity called spacetime, on which 
its various tensors are defined. Being unobservable, it would presumably 
be just as objectionable to Mach as Newton’s absolute space. Indeed, 
under certain conditions, general relativity is committed to the existence 
of space as well. Temporal orientability (intuitively, the existence of a 
globally consistent time direction) means the existence of a timelike 
vector field defined throughout spacetime. If in addition this field has zero 
vorticity, the three-parameter set of curves to which the vectors are tangent 
may be considered (the worldlines of) points of an enduring three-space. 
For details, see Earman [1970c]. Of course, since a temporally orientable 
spacetime may well have more than one everywhere-defined timelike 
vector field, it may also allow one to project out more than one space. 
Moreover, a general relativistic three-space will not necessarily be absolute 
in the sense of ‘unchanging’, since it may have temporal slices which differ 
in structure. But uniqueness and unchangingness were not Mach’s com- 
plaints against Newton. (Unchangingness was one of Einstein’s ([1922], 
pp. 55 f.).) 

In addition to denying with Mach that absolute space exists, Einstein 
was also concerned to deny that any such unobservable entity could be 
invoked as a cause. The most serious defect of the initial-value formulation 
as an embodiment of Machian–Einsteinian philosophy is that it attributes 
a causal role to space in a very strong sense. First of all, the Machian data 
include the geometry of the initial spacelike hypersurface, and its rate of 
change, so that the general-relativistic analogue of space-at-a-time is involved 
as an active cause of all phenomena governed by the four-metric. More- 
over, the initial energy and energy-flow must include—in steps (i) and 
(ii)—that contained in gravitational radiation (MTW). But since gravita- 
tional radiation is merely propagating ripples in the metric of spacetime 
and hence space, space serves as a cause not only in the sense that its 
overall structure’s rate of change figures in the initial conditions, but also 
in that localised events consisting of small-scale changes in its structure 
do so. Thus, in general relativity space is a cause not only in the initial- 
condition sense, for which Einstein criticised Newton, but also in the 
stronger active sense.1 

Is the initial-value formulation, then, inconsistent with Mach’s denial 
that space is an independent entity? Since no matter or non-gravitational 
fields need be present, a spacelike hypersurface cannot be reduced to 
possible relations of actual entities of these kinds. But could it be reduced 
to possible relations of possible bodies or fields? Nothing in the initial- 
value formulation decisively rules out such a reduction, so far as I can 
see. A full discussion of the prospects of a reductive elimination of space 
is beyond the scope of this paper. I wish only to point out that the initial- 
value formulation at least fails to support Mach’s claim that one need 
appeal only to the interactions of bodies and not to space to explain centrifugal 
forces. The formulation is prima facie inconsistent with Mach’s claim, 
since some reduction not provided by the formulation itself would be 
required to resolve the apparent inconsistency due to the invocation of 
space. 

One can go further, though, and say that the new formulation makes 
such a reduction even less plausible than it would otherwise be. As Sklar 
has noted ([1974], pp. 171-3), a strong argument against the view that 
space is reducible to possible relations of merely possible matter and/or 
non-gravitational fields is that talk of possible phenomena, which would 
be actualised under specified circumstances, only makes sense if some 
underlying actual reality is presupposed. Thus, for example, ‘solubility’ 
is plausibly regarded as a place-holder for a description of a (possibly 
unknown) microstructure, or (Quine [1960], §46) as indicating that some 
object with the same microstructure dissolves at some time. If this is 
correct, it is impossible to construe talk of empty space as a manner of 
speaking of possible relations of merely possible matter and fields. For 
there would be no actual object whose structure legitimates such talk. 

Whatever the strength of the foregoing argument and its theory of dis- 
positions, it is surely stronger if, as in the initial-value formulation, space 
is invoked as a cause. For how could the possible relations of merely 
possible matter and fields manage to exert a causal influence on later events 
and circumstances? Dispositions, like fragility, may certainly serve as 
initial conditions in the explanation of events, like breakage. But, in familiar 
cases at least, this is reasonable only because an underlying structure is 
presupposed as responsible for the events. I do not take this argument to 
be decisive, since it is always dangerous to argue about modern physics 
from the standpoint of intuitive preconceptions about familiar phenomena 
such as dissolving. Still it is fair to say that a relational theory of space is 
even less plausible in the context of the initial-value formulation than in 
the context of the original formulation. 

1 Concededly, the Machian data do not refer to space in the technical sense of manifold- 
with-metric, but only to what might be called a conformal space: manifold-with-conformal- 
metric. But Einstein and Mach have no less ground to object to invoking this as a cause 
than to invoking space in the strict sense. 

Finally, I turn to the question of Mach’s views on the relativity of all 
motion to some body or bodies. Here again it is clear that the initial-value 
formulation is inconsistent with Mach’s view. Since it permits the possi- 
bility that the only mass-energy present is in the form of gravitational 
radiation, and since this radiation moves through spacetime along null 
geodesics, evidently the new formulation allows for the possibility of—and 
thus requires the intelligibility of—motion which is not relative to any 
body. One way to see the divergence of the new formulation from Mach’s 
own view is to note that for him the fixed stars played a dual role: first, 
to serve as causes of centrifugal tendencies; and, second, to serve as the 
bodies which define the basic inertial reference system required to state 
the laws of motion. By contrast, the causes of inertia in the initial-value 
formulation cannot always serve this second function as well. The mass- 
energy whose distribution and flow are specified on the initial hypersurface 
may be in the form of bodies capable of defining a system of reference, 
but it need not be. It could instead, for example, be in the form of electro- 
magnetic or gravitational radiation. Thus the causes of inertia need not 
provide any bodily reference points for the description of motion. 

I conclude that despite the scientific interest of the initial-value formula- 
tion, and despite the fact that it does provide a precise version of (something 
like) Mach’s principle, it is inconsistent with two of the main elements 
of Mach’s anti-Newtonian position—the relativity of all motion, and the 
consequent necessity to formulate the laws of motion with reference to the 
fixed stars—and inconsistent as well with Einstein’s denial that space is in 
any sense a cause. Moreover, it is at least prima facie inconsistent with 
(and thus fails to support) Mach’s denial that space exists as an entity in 
its own right. 


7 METRICAL RELATIONISM 


I would like to conclude with a few remarks on the implications of general 
relativity for the second type of relational theory mentioned in my intro- 
duction. In the version discussed by Grünbaum, it is the thesis that space- 
time has a metrical structure only in virtue of the presence in it of such 
objects as atomic clocks, free particles, photons, and solid rods.1 (As we 
saw in section 3, a similar thesis for space has been attributed incorrectly 
to Leibniz.) In a recent paper ([1976]), Grünbaum states this thesis, but 
does not firmly commit himself to it, as he did for space in earlier work 
([1963], p. 11). 

1 Note that Grünbaum requires the actual presence of these objects, rather than saying 
that metrical properties of spacetime are defined in terms of the behaviour of hypothetical 
entities moving in it. 

Offhand, it would seem obvious that metrical relationism is refuted by 
the existence of the empty-spacetime solutions (with or without gravita- 
tional radiation) discussed above, for they attribute to spacetime a definite 
structure despite the absence of any objects whose behaviour reflects that 
structure (Earman, [1970a]). But Grünbaum replies that metrical rela- 
tionism can be defended by claiming that in the case of an empty space- 
time, g can simply be interpreted as a tensor field, not one whose specific 
function is to specify intervals of spacetime ($g_{\mu\nu}\,dx^{\mu}dx^{\nu}$), i.e. not as a 
metric tensor. If a spacetime contains metric standards only in some 
regions, then g is to be interpreted metrically only in those regions. 
However, if there is sufficient gravitational radiation at a point in otherwise 
empty spacetime to define enough null geodesics to permit the determina- 
tion of the metric up to a conformal factor, Grünbaum indicates he might 
be prepared to say metrical relationism has been refuted. 

The obvious trouble with this is that no one (or hardly anyone) is 
interested in the question of whether general relativity can be made com- 
patible with metrical relationism by subjecting it to an extremely awkward 
reinterpretation. On the view Grünbaum suggests, the field equation is 
an ambiguous statement whose symbols have different meanings depending 
on what solution is in question, and even on what region of spacetime 
is in question. But standard presentations of general relativity always 
define g as the metric tensor, and it is this theory’s implications which 
are of interest, not the awkward meaning-varying theory described by 
Grünbaum. 


CONCLUSION 


It is clear then that despite the suggestion carried by its name, general 
relativity—even in the apparently Machian initial-value formulation—is 
inconsistent with the version of relationism which identifies space or 
spacetime with a system of relations of actual bodies (or actual anything 
else). For general relativity allows spacetime to be totally empty. Moreover, 
the newer formulation fails to help Mach explain centrifugal forces solely 
by reference to the interactions of bodies and without invoking space, and 
decreases the plausibility of saying that space can be reduced to merely 
possible matter and non-gravitational fields. Finally, general relativity in 
either formulation is inconsistent with the version of relationism which 
makes metrical properties depend upon the existence of physical objects 
whose behaviour is related to them.1 


University of Maryland, College Park 

The Evolution of Empiricism: 
Hermann von Helmholtz and the 
Foundations of Geometry 


by JOAN L. RICHARDS 


During the nineteenth century there was a growing trend towards 
specialisation in science. It became more and more rare to find individuals 
who remained ‘natural philosophers’, and more common to find those 
who saw themselves as ‘chemists’, ‘physicists’, ‘physiologists’, or 
‘mathematicians’. Despite this trend, which has continued with increasing 
strength until the present day, there were a number of scientific investi- 
gators who continued to move with a certain freedom from field to field, 
and who made significant contributions to many of them. The phenomenon 
of the late nineteenth-century scientific universalist poses a number of 
historio-philosophic questions. How can his concerns be understood 
within the context of a specialised discipline? In what ways do his wider 
concerns affect the choice of problems and content of his work in each 
distinct field? Concentrating on one particular case of a late nineteenth- 
century universalist, perhaps the answers to these and similar questions 
can be clarified. 

Because of his extraordinarily far-ranging interests, Hermann von 
Helmholtz has frequently been cited as an example of a universalist who 
made a number of significant contributions in several specialised areas. 
Trained as a physician, he wrote several works devoted to an analysis of 
the physiology of the senses. His three-volume work, Handbuch der 
physiologischen Optik (1856-1866), has gone through many editions and 
is regarded as pivotal in the field. In 1863, he published Die Lehre von 
den Tonempfindungen, which is equally important to physiological 
acoustics. In mathematical physics, his paper Über die Erhaltung der Kraft 
(1847) has earned him a substantial position in the history of physics, 
as a pioneer in the area of thermodynamics. But Helmholtz did not confine 
his work to physiology and physics, nor even to the natural sciences. He 
also explored issues in aesthetics, philosophy, and mathematics. By 
exploring these works and the nature of Helmholtz’s contributions to 
these secondary fields of interest, light can be shed on the nature of the 
interactions among his seemingly dissimilar investigations. 

What I purport to show in this paper is that Helmholtz’s mathematical 
and philosophical works are closely related, not only to each other, but 
also to some of the major themes and interests expressed in his Physiological 
Optics. In going from one special area to another, he was not turning his 
attention randomly to a number of disparate problems. Rather his focus 
remained firm and unswervingly fixed on one particular issue: the nature 
of our perception. His interest in this problem surfaces in his work in 
physiological optics, where he challenged the ‘intuitionists’ about the 
physiological nature of our perceptions. Indirectly trying to bolster his 
arguments, he pursued the specific problem of space perception into the 
area of mathematics where he explored the foundations of geometrical 
axioms. This mathematical work exposed him to some unexpected results 
which led him into philosophy where his interpretation of geometry as 
a physical science caused a great deal of discussion and interest. In all 
of these cases, his views were noticed, discussed and developed by 
specialists within the fields. 

By tenaciously following the implications of his problem of perception 
into peripheral fields, Helmholtz caused a certain consternation among 
the specialists. Their criticisms were often directed against technicalities, 
however; in general the content of Helmholtz’s work was germane. In 
mathematics, it necessitated a certain amount of ‘follow through’, where 
professionals pursued his results further and presented them in more 
polished mathematical form. In philosophy, his approach suggested new 
ways of defining problems and led to whole new areas of thought. In 
physiology, his own field of primary interest, his peripheral researches 
served to buttress certain views and attitudes which were difficult to 
demonstrate on purely experimental grounds. 

In each field of research, Helmholtz’s work is more determined by his 
studies as a whole than by internal developments in the field. His con- 
tributions can only serve as stumbling blocks to historians tracing strictly 
defined histories of individual areas unless this fact of his universalism 
is recognised and explored. The unit of historical research in this instance 
cannot be the specialised field, but must rather be the total intellectual 
profile. Within this unit the logic of Helmholtz’s concerns and con- 
tributions takes shape. Thus, in order to treat his mathematical work, 
it is necessary to consider in some detail that aspect of his work in physi- 
ology that led him into mathematics. Only with such an introduction do 
his mathematical interests become clear. In turn, certain philosophical 
works are directly motivated by an unexpected result he had to include 
in his mathematical investigations, and can only be fully understood in 
the context of the whole. 


1 Helmholtz’s interest in the physiology of the senses can be traced to 
the time when he was a student under Johannes Müller.1 His views on 
this subject are plainly the result of many different contextual factors, 
influences and issues, but one in particular is critical for an understanding 
of his mathematical work. In Müller’s law of specific sense (or nerve) 
energies, Helmholtz found an empirical confirmation of Kant’s philosophy 
of knowledge. This point is made as early as 1855 in the essay ‘Über das 
Sehen des Menschen’ and is clearly reiterated in the third volume of the 
Physiological Optics in the section containing a general historical study 
of the theory of sense impressions. 


The most essential step for putting the problem in its true light was taken 
by Kant in his Critique of Pure Reason, in which he derived all real content of 
knowledge from experience. But he made a distinction between this and what- 
ever in the form of our apperceptions and ideas was conditioned by the peculiar 
ability of our mind. Pure thinking a priori can yield only formally correct 
propositions, which, while they may appear to be absolutely binding as necessary 
laws of thought and imagination, are, however, of no real significance for 
actuality; and hence they can never enable us to form any conclusion about 
facts of possible experience. 

According to this view perception is recognized as an effect produced on our 
sensitive faculty by the object perceived; this effect, in its minuter determina- 
tions, being just as dependent on what causes the effect as on the nature of that 
on which the effect is produced. This point of view was applied to the empirical 
relations especially by Joh. Müller in his theory of the Specific Energy of the 
Senses.2 


Helmholtz’s own researches were strongly coloured by his view that 
Müller’s law provided experimental support for the epistemological 
aspects of Kant’s philosophy. 

Helmholtz was not alone in this general approach to perception, but 
his conclusions were very different from those of another group of 
physiologists also strongly influenced by Müller. Helmholtz was at the 
head of the so-called ‘empirical’ school of physiology which stood in 
opposition to the ‘intuitionists’ (‘nativists’) whose ideas were embodied in 
the writings of Ewald Hering. These schools were in sharp disagreement 
as to what degree our sense perceptions are taught by experience. The 
‘intuitionists’ held that very little was learned through experience, positing 
innate mechanisms which physiologically determine perceptions. In 
contrast, Helmholtz assigned a large role to learning in determining our 
perceptual images. 

1 Johannes Müller (1801-1858) made contributions to physiology, anatomy, and 
zoology. From 1838 to 1842, Helmholtz studied under Müller at the University in 
Berlin. 
2 Helmholtz [1867], p. 456. (Southall trans., III, pp. 35-6.) 

One issue which illustrates this disaccord between the schools of thought 
is that of spatial perception. On the question of the degree to which space 
perceptions are learned, Helmholtz and the ‘intuitionists’ were in complete 
disagreement. Helmholtz held that our perceptions of depth, width, up 
and down are due to learning from experiences in the world. The 
‘intuitionists’ attributed these sensations to intrinsic physiology. 

In a section of Physiological Optics devoted to a review of the theories 
of perception which were current in his day, Helmholtz describes the 
‘intuitionist’ interpretation of this problem. 

The cardinal fact about them all [intuitionist theories] is that the localization of 
the impressions in the field of view is derived through some innate contrivance, 
and either the mind is supposed to have some direct knowledge of the dimensions 
of the retina, or it is assumed that, as the result of the stimulation of definite 
nerve fibres, certain apperceptions of space arise by virtue of an innate mechanism 
which cannot be further defined.1 


Hering, who was Helmholtz’s major antagonist on this point, pushed 
furthest the idea that innate mechanisms determine the forms of our 
space perceptions. Helmholtz summarises Hering’s view as follows: 


Mr. Hering assumes that when the individual points of the retina are in the 
state of stimulation, there are three different kinds of space-feelings besides 
the colour sensations. The first one corresponds to the altitude-value 
(Höhenwert) of the given place on the retina, and the second to the azimuth-value 
(Breitenwert). ... There exists also a third space-feeling of a special kind, which 
is supposed to have equal and opposite values for each pair of identical retinal 
points; whereas for any pair of retinal points which are symmetrically situated 
these values are equal and of the same sign. The depth-feeling of the temporal 
halves of the two retinas is positive, that is, corresponds to increase of depth; 
and that of the two nasal halves is negative, that is, corresponds to decrease 
of depth.2 


Thus, for Hering, not only were such parts of sensation as colour and 
brightness determined by the physiology of our eye, but the conception 
of space was determined in the same way. 

In Phystological Optics, Helmholtz begins by emphasising that human 

perceptions are not due simply to physiological mechanisms, but that psychic 
processes are also involved. This fact is what he sees Hering and the 
‘intuitionists’ as having missed. . 
...to many physiologists and psychologists the connection between the sensation 
and the conception of the object usually appears to be so rigid and obligatory 
that they are not disposed to admit that, to a considerable extent at least, it 
depends on acquired experience, that is, on psychic activity. 


1 Ibid., pp. 804-5. (Southall, pp. 541-2.)
2 Ibid., p. 809. (Southall, p. 547.) 3 Ibid., p. 431. (Southall, p. 5.)

Where the ‘intuitionists’ posited innate mechanisms which determine
our space perceptions, Helmholtz substituted a series of unconsciously
learned experiences.


The psychic activities that lead us to infer that there in front of us at a certain 
place there is a certain object of a certain character, are generally not conscious 
activities, but unconscious ones. In their result they are equivalent to a 
conclusion, to the extent that the observed action on our senses enables us to 
form an idea as to the possible cause of this action; although, as a matter of fact, 
it is invariably simply the nervous stimulations that are perceived directly, 
that is, the actions, but never the external objects themselves.¹

As to the precise nature of these unconscious conclusions, Helmholtz does not
pretend to have an answer except to emphasise that they are the product
of experience and interaction with the world. The most striking characteristic
of these unconscious conclusions is that they exercise so total an
effect on our consciousness that we are hardly aware of them. Helmholtz 
explains: 

... the only psychic processes involved ... are the involuntary ones connected 
with the association of ideas and with the voluntary flow of ideas which are not 
directly subject to our consciousness and our will; although, by making self- 
conscious ideas and aims concur with them, we can exert a certain influence on 
their course. For this very reason the effects of these ideas are so powerful as 
to be practically beyond our control, the will and the consciousness being 
confronted as if by some force of nature, exactly as in the case of the sensations 
that we obtain directly from outside. Thus, whatever is joined with the sen- 
sations in the results of psychic processes of this kind seems to us to be also the 
effect of an external agency, just as the immediate sensation itself is, and not 
something discovered by conscious free reflection, thought out by ourselves.²


Interaction with the objective world is what determines our spatial 
perceptions through unconscious assimilation. But this unconscious 
learning is so powerful, the unconscious conclusions we draw so pervasive, 
that they are very close to the innate forms of the ‘intuitionists’. In fact,
since they are so powerful, the problem arises how one is able to dis- 
tinguish those parts of our perceptions which are dictated by innate 
mechanisms, and those parts which are the results of unconscious con- 
clusions exercising strong control on our perceptions. 

Helmholtz focuses on perceptual illusions in treating the problem of 
distinguishing between innate and learned perceptions. 


... a visual impression may be misunderstood at first, by not knowing how to 
attribute the correct depth-dimensions; as when a distant light, for example, is 
taken for a near one, or vice versa. Suddenly it dawns on us what it is, and 
immediately, under the influence of the correct comprehension, the correct 


1 Ibid., p. 430. (Southall, p. 4.) 2 Ibid., p. 804. (Southall, p. 541.)
perceptual image also is developed in its full intensity. Then we are unable to
revert to the previous imperfect apperception. 

This is very common especially with complicated stereoscopic drawings of
forms of crystals and other objects which come out in perfect clearness of
perception the moment we once succeed in getting the correct impression.¹


From the instances of illusions of the senses, where the actual perceptions
can be changed by experience, Helmholtz formulated a principle: 
‘... nothing in our sense-perceptions can be recognized as sensation [innate] 
which can be overcome in the perceptual image and converted into its opposite 
by factors that are demonstrably due to experience.’ Applying this ‘principle’ 
to our perceptions of space, Helmholtz concluded that they are due to 
unconscious learning. 


Whatever, therefore, can be overcome by factors of experience, we must 
consider as being itself the product of experience and training. By observing 
this rule, we shall find that it is merely the qualities of the sensation that are 
to be considered as real, pure sensation; the great majority of space-apperceptions,
however, being the product of experience and training.³


The ‘intuitionists’ did not deny the reality of such instances of sensory 
illusions, but they interpreted the change of perception differently. Rather 
than positing an actual reordering of the perception for the anomalous 
situation, they found a temporary triumph of experience and the will over 
the innate forms which remain constant and unaltered. There was not the 
same possibility for change in their construction that there was for 
Helmholtz. 

The distinction between Hering’s ‘intuitionist’ theory of innate forms 
and Helmholtz’s ‘empirical’ theory, relying on learned forms, is a subtle
and technical one. As there was no physiological mechanism specified for
Hering’s assumption of space feelings of retinal points, and since
Helmholtz’s ‘unconscious conclusions’ were equally unspecified, the argument
was reduced to a large extent to intricate experimental interpretations.
It is outside of the realm of this paper to discuss the experiments used
to argue one side of the case or the other.⁴ Perhaps more important in
understanding Helmholtz’s work as a whole, however, is that adherence to one
school or the other also involved a commitment in a metaphysical sense
of the word, either to learned forms of our perceptions, or innately
determined ones. Fully aware of this, in discussing Hering’s theory in the
section ‘Review of the Theories’, Helmholtz begins as follows: 

The first objection which I should have to offer, and which to my way of 
thinking seems absolutely insuperable, is that I cannot conceive how a single


1 Ibid., pp. 436-7. (Southall, p. 12.)
2 Ibid., p. 438. (Southall, p. 13.) 3 Ibid., p. 438. (Southall, p. 13.)
4 These arguments can be found in Helmholtz [1867], pp. 812 ff. (Southall, pp. 551 ff.)
nervous stimulation can produce a completed idea of space without antecedent
experience. However, I realize that this objection is probably of too metaphysical 
a nature to be considered on scientific grounds, and so I simply register it here 
for the benefit of those readers who share my view.¹


In Helmholtz’s mind it is clear that debate does not belong as part
of the argument in a scientific treatise. Hence he drops this point and
goes on to a technical argument based on experiment. But in a number 
of mathematical researches he pursues the implications of his physio- 
logical theory in a more abstract way. Without offering it as part of his 
demonstration in his physiological work, Helmholtz nevertheless set out 
in mathematics to find the empirical facts which could lie at the basis of 
our perceptions of space. Finding the facts which were analytically adequate
and experientially universal enough to be the basis for our geometrical 
constructions would not prove his ‘empirical’ thesis in physiology, but
it would serve to bolster his metaphysical position with a plausible 
argument. It was important to Helmholtz that he succeed in finding the 
fact or small group of facts which could form the basis for geometrical 
axioms. 


2 Helmholtz’s first mathematical work ‘Über die thatsächlichen
Grundlagen der Geometrie’ (1866) is a short, general paper. Its themes
were expanded and developed with greater mathematical precision in a
second paper: ‘Über die Thatsachen, die der Geometrie zum Grunde
liegen’ (1868). Also in 1868, Helmholtz composed an addendum (Zusatz)
to his 1866 work, correcting an important oversight concerning pseudo-
spherical (hyperbolic) space. This addendum and its implications will
be discussed in section 3 (p. 246). 

In these mathematical works, Helmholtz makes it clear that he began 
his investigations because of an interest in space perception.

The investigations about the manner in which localization takes place in the 
field of sight have induced the speaker [Helmholtz] to muse as well about the 
essential origins of universal space perceptions. First of all, there is a question 
here whose answer undoubtedly belongs in the exact sciences, namely this:
which propositions of geometry express truths of factual meaning, which on the 
other hand are only definitions or the results of definitions and the particularly 
chosen expressions.²


After he had begun but not yet published his investigations, he realised 
that Riemann’s work ‘Über die Hypothesen, welche der Geometrie zu
Grunde liegen’³ had covered much the same ground. There were significant
1 Helmholtz [1867], p. 812. (Southall, p. 551.) 2 Helmholtz [1866], WA, p. 610.


3 This article was Bernhard Riemann’s (1826-1866) Habilitationsvortrag delivered before
the faculty of the University in Göttingen in 1854. It was not published until 1867.
differences between the two works, however, and Helmholtz found
encouragement in the fact that his concerns had been treated by a pro- 
fessional mathematician of Riemann’s stature. In his 1868 paper he 
notes:


Furthermore, I must also acknowledge that if through the publication of
Riemann’s investigations the priority in reference to several of the results of 
my work is brought up, it was for me... of no little weight to see that such a 
superb mathematician had honored the same questions with his interest, and 
that it was an important guarantee to me of the correctness of the privately 
explored way, to come across him as a companion thereon. 


Certain parallels between Helmholtz’s work and that of Riemann are 
striking. The primary concern for both men was to clarify the assumptions
which lay at the base of our geometrical constructs, distinguishing between
those which are logical consequences of any manifold, and those which
narrowed this conception to the special case of three-dimensional space. 
There is a subtle difference between their approaches, however, which 
reveals a difference in their orientation. 

Riemann introduced the case as follows: 


...I have in the first place, therefore, set myself the task of constructing the 
notion of a multiply extended magnitude out of general notions of magnitude. 
It will follow from this that a multiply extended magnitude is capable of different 
measure-relations [Massverhältnisse], and consequently that space is only a
particular case of a triply extended magnitude.²


He continues to say that it must be empirical facts which lead us to choose 
a particular space from among the many alternatives which fit the most 
general notion of a three-dimensional magnitude. 


But hence flows as a necessary consequence that the propositions of geometry 
cannot be derived from general notions of magnitude, but that the properties 
which distinguish space from other conceivable triply extended magnitudes are 
only to be deduced from experience. Thus arises the problem, to discover the
simplest matters of fact from which the measure-relations of space may be 
determined; a problem which from the nature of the case is not completely 
determinate, since there may be several systems of matters of fact which suffice 
to determine the measure-relations of space... These matters of fact are—like 
all matters of fact—not necessary but only of empirical certainty; they are 
hypotheses.³


Hence, by his abstraction and the development of the idea of manifolds,
one of Riemann’s primary concerns was to find what elements of our
geometrical construction are hypotheses abstracted from experiential
1 Helmholtz [1868], WA, p. 619. 


2 Riemann [1867], pp. 133-4. (Clifford trans., p. 14.)
3 Ibid., p. 134. (Clifford, pp. 14-15.)
data, and what parts are necessary components of a manifold. His emphasis
in viewing these ‘facts’ is on their character as hypotheses, not as experi- 
ences. In choosing among the many possible ‘matters of fact’ which could 
determine our measure relations, his interest is in those whose abstractions
are the ‘simplest’ hypotheses, not in deciding on the experiential level
which might in some way be more basic. The title of his article, ‘Über
die Hypothesen, welche der Geometrie zu Grunde liegen’, underlines this
concern with hypotheses. 

Helmholtz was also interested in distinguishing between the logically 
necessary elements of our geometry, and those which were abstractions 
from experience. He parallels Riemann’s approach to this problem through
abstract analytic geometry. He states the problem as follows:


... we can only concretely conceive such spatial relations as can possibly be
represented in real space; this perception [Anschaulichkeit] thus easily entices
us into naturally presupposing something that in truth is a particular and not
self-evident property of the external world that lies before us.¹


Helmholtz turns to the abstraction of analytic geometry in order to cope
with this problem. 


This difficulty is avoided in analytical geometry in which one reckons with pure
concepts of quantity and in which proofs need no perception [Anschaulichkeit].
Thus in order to arrive at an answer to the question raised one might
engage in a search for those analytical properties of space and those spatial
quantities that must be assumed within analytic geometry in order to fully
ground its theorems from the start.²


Certainly the points of departure of the two men were similar. There 
is an essential point of difference, however, in the basis for choosing among 
the ‘matters of fact from which the measure-relations of space may be 
determined’ (cf. p. 242). Riemann made his choice in favour of the 
logically simplest hypotheses. Helmholtz, on the other hand, was more 
concerned with the facts which determine experience, with the unconscious 
learning fundamental to sensation. He chose from among various ‘matters 
of fact’ those that he found most necessary to the ‘unconscious con- 
clusions’ governing space perception. 

Riemann and Helmholtz were agreed on the fundamental importance 
of measure relations in determining our space. The basic difference 
between their views lay in their approaches to this determination. 

Riemann approaches the question analytically through abstract notions 
of quantity. Only later does he interpret his results geometrically in order 
to apply them to space. 


1 Helmholtz [1866], WA, p. 611. 2 Ibid., p. 611.
These measure-relations can only be studied in abstract notions of quantity,
and their dependence on one another can only be represented by formulae. 
On certain assumptions, however, they are decomposable into relations which, 
taken separately, are capable of geometric representation; and thus it becomes 
possible to express geometrically the calculated results. 


With this as his introduction, Riemann begins his investigation by assuming
the independence of the length of one-dimensional line elements from
position, an assumption assuring that each line element is measurable
by every other. Using this assumption he investigates the possible mathematical
expressions which could be used to measure the length of a line.
As the simplest case he arrives at the formula $ds = \sqrt{\sum (dx)^2}$, the usual
measure relation for Euclidean space.

In the final section of his paper entitled ‘Applications to Space’ 
Riemann explores what conditions are ‘necessary and sufficient to deter-
mine the metric properties of space.’² Here he shows in what ways our
interpretations of space might be modified according to particular experi- 
mental investigations, in particular in the realm of the very small. 
Ultimately, he concludes that such investigations belong to the realm of 
physics, however, stating:

Researches starting from general notions, like the investigation we have just
made, can only be useful in preventing this work [of physicists] from being
hampered by too narrow views, and progress in knowledge of the interdependence
of things being checked by traditional prejudices.³

Riemann saw his work as a possible liberation of physics and science in 
general from unnecessary, binding preconceptions. 

In contrast, Helmholtz was primarily concerned with the space of 
everyday life. Analytic geometry was introduced as a tool to
distinguish those aspects of our space which were not logically necessary
from those which were. Like Riemann, he based his argument on the
idea of the n-dimensional manifold. In defining the measure relation on
the manifold, Helmholtz diverges from Riemann’s analytic approach. He 
posits the possibility of the free movement of rigid bodies in space (as 
opposed to that of one-dimensional line elements) as his starting point.
Using abstractions from our experience of the free movement of rigid
bodies, he is able to mathematically derive Riemann’s distance formula,
$ds = \sqrt{\sum (dx)^2}$.

The basis for Helmholtz’s choice of free movement as his fundamental 
concept is the concern with the experiences underlying our spatial 
perceptions, which he had carried over from his physiology. He made 





1 Riemann [1867], p. 138. (Clifford, p. 15.)
2 Ibid., p. 146. (Clifford, p. 36.) 3 Ibid., p. 150. (Clifford, p. 37.)
this choice because of the relationship of movement to ideas of congruence,
which he felt to be fundamental to our perception of space. 


However, in general one cannot speak of congruence, if rigid bodies or point 
systems cannot be moved to another place without change of form, and if 
congruence of two spatial magnitudes is not a datum which remains independent 
of all movements. The possibility of space measurement through the establishment
of congruence, I thus presupposed from the beginning and set myself
the problem to look for the most general analytic form of a multiply determined
manifoldness in which the required form of movement is possible.¹


That congruence, rather than the equation of the shortest distance 
between two points, is fundamental to Euclidean space, was a choice he 
made based on experiential rather than analytic criteria.


My point of departure was, that all initial measurement of space is based on 
the observation of congruence; but the property of the light ray as a straight 
line is obviously a physical fact, which is supported by particular experiments 
from another area, and which, for the blind, who can also gain full conviction 
of the accuracy of geometrical axioms, has absolutely no weight.²


Helmholtz found Riemann’s analytic approach to measure determination 
to be unsatisfactory because it did not reflect the experientially necessary 
part of our conception of space. The blind, he emphasised, can understand 
geometry without realising the straightness of light, but not without 
understanding congruence (through their sense of touch); hence 
$ds = \sqrt{\sum (dx)^2}$ is not fundamental in the way that the congruence of rigid
bodies is. Riemann’s derivation of this as the simplest form of the analytic
expression to which his assumptions led him is one that Helmholtz did
not dispute. But it did not satisfy Helmholtz’s concern, which was with
the experiences which determine our geometric constructs. Thus, within
mathematics, Helmholtz’s results are dictated by his physiological aim:
to find the facts which are fundamental to our experience of space, and
mathematically sufficient to determine it, the basis for our ‘unconscious
conclusions’. 





3 In the congruence of rigid bodies, and the attendant concept of their 
free motion in space, Helmholtz thought he had specified the experiences 
which determine our space. With the condition that space be three 
dimensional and infinite (unendlich) he felt that the transform characterising 
the motion of rigid bodies determined space as strictly Euclidean. 

... it follows further... that if the number of dimensions be determined as
three and the infinite extension of space is assumed, that no other geometry is
possible, than that taught by Euclid. 


1 Helmholtz [1868], WA, p. 621.
2 Ibid., p. 621. 3 Helmholtz [1866], WA, p. 615.
He believed he had succeeded in generating Euclidean geometry from
ideas as commonsensical as those of the traditional axioms, yet avoiding 
the difficulties which were inherent in the formulation, in particular, the 
parallel postulate.


Thus, the system of these postulates makes no assumptions which the ordinary
form of geometry does not also make; it is complete and sufficient even without
the special axioms about the existence of straight lines and planes and without 
the axiom of parallel lines. 


The illusion that congruence was both experientially and analytically 
adequate to define our spatial perceptions was soon shattered, however, 
by the appearance of Eugenio Beltrami’s papers in 1867 and 1868
describing a pseudospherical geometry. In these works Beltrami showed
that there was a way of interpreting space of constant negative curvature 
on a bounded area, through Euclidean geometry. Helmholtz was quick to 
recognise his oversight, and added an 1868 addendum to his 1866 paper 
stating simply that he had overlooked the possibility of Beltrami’s space 
of negative curvature, and admitting it into his construction. 


The claim put forward there [in the 1866 paper] that space, if it be extended 
infinitely [unendlich], must necessarily be flat (in Riemann’s sense) [measure 
of curvature = 0] is false.

In particular, this is a consequence of the highly interesting and important 
researches of Mr. Beltrami... in which he investigates the theory of planes and 
space of constant negative curvature and has demonstrated its correspondence 
with the imaginary geometry advanced yet earlier by Lobatschewsky. In this 
geometry, space is infinite [unendlich] in all directions; figures which are con- 
gruent to a given figure can be constructed equally in all their parts; between 
any two points only one shortest line is possible, but the axiom of parallel lines 
does not prove to be correct.²


In his second paper, first published in 1868, this revision, including 
pseudo-spherical space in his construction, is made in footnotes. 
Including a non-Euclidean geometry in this way in his basic experiential 
formulation of geometrical axioms raises a very basic question for 
Helmholtz: what is it in our experience which determines our space as 
Euclidean and not pseudo-spherical? Congruence and the motion of rigid 
bodies do not go far enough in narrowing the possibilities to specify 
Euclid’s geometry. Helmholtz’s philosophical discussion of the axioms of 
geometry is an attempt to deal with this problem. It is largely contained
in two articles: ‘The Origin and Meaning of Geometrical Axioms’ (1876)³
1 Ibid., p. 616. 2 Ibid., p. 615.
3 The dating of this article is somewhat confusing. It was first delivered in German in
Heidelberg in 1870. It is printed in Hermann von Helmholtz, Vorträge und Reden,
(1884), II, pp. 3-31. In the discussion of the issues raised here, most people refer to
the article as it appeared in English in Mind, 1876, pp. 301-21. It was the article as it
appeared in English which triggered Land’s response ([1877]).
and ‘On the Origin and Meaning of Geometrical Axioms (II)’ (1878).
The second article is an expansion and elaboration of some of his ideas 
in an attempt to answer some of the criticisms that the first article 
occasioned.

In his 1876 article ‘The Origin and Meaning of Geometrical Axioms’
Helmholtz reintroduces the axiom system he had abstracted from the 
free motion of rigid bodies in space, indicating that Beltrami’s pseudo- 
spherical space, as well as Euclidean space, is possible in this construction. 
In this paper, Helmholtz’s aim is to carry this one step further, into the 
realm of spatial perception: to show that a pseudo-spherical space can 
be imagined by beings with our perceptual faculties. He defines his goals
in this direction as follows:

By the much abused expression ‘to represent’ or ‘to be able to think how something
happens’ I understand—and I do not see how anything else can be
understood by it without loss of all meaning—the power of imagining the whole
series of sensible impressions that would be had in such a case. 


He then describes what would be our visual impressions in a pseudo- 
spherical world, concluding his attempts as follows: 


These remarks will suffice to show the way in which we can infer from the 
known laws of our sensible perceptions the series of sensible impressions which 
a spherical or pseudo-spherical world would give us, if it existed. In doing so 
we nowhere meet with inconsistency or impossibility any more than in the 
calculation of its metrical proportions. We can represent to ourselves the look 
of a pseudo-spherical world in all directions just as we can develop the conception
of it. Therefore it cannot be allowed that the axioms of our geometry
depend on the native form of our perceptive faculty, or are in any way connected
with it.²

Our adherence to Euclidean geometry is not due to our universal experi- 
ence of the congruence of rigid bodies. This experience is not enough to 
narrow the possibilities to make them uniquely Euclidean. 

This realisation forced Helmholtz to take a major step further in 
specifying the experiential basis for our geometry. He appends the 
principles of mechanics to the basic fact of congruence. 


If it were useful for any purpose, we might with perfect consistency look 
upon the space in which we live as the apparent space behind a convex mirror 
with its shortened and contracted background; or we might consider a bounded
sphere of our space, beyond the limits of which we perceive nothing further, 
as infinite pseudospherical space. Only then we should have to ascribe to the 
bodies which appear as solid... corresponding distentions and contractions, 
and we must change our system of mechanical principles entirely; for even the 
proposition that every point in motion, if acted upon by no force, continues 


1 Helmholtz [1876], p. 304. 2 Ibid., p. 318.
to move with unchanged velocity in a straight line, is not adapted to the image
of the world in the convex mirror. The path would indeed be straight, but the 
velocity would depend upon the place. 

Thus the axioms of geometry are not concerned with space-relations only
but also at the same time with the mechanical deportment of solidest bodies
in motion.¹

By appending the principles of mechanics to the foundations of geometry
Helmholtz has taken a bold step, for it makes our Euclidean geometry
dependent on conscious experience subject to proof or disproof by
experiment.

This is a powerful and new contention. The emphasis on facts already 
discussed in Helmholtz’s works was directed towards the facts underlying
our concepts, which many held to be innate. In showing that there was
a factual interpretation for perceptual forms, he could strengthen his
contention that rather than being innate they were subconsciously learned. 
But this aspect of the problem was a difficult one for which he had no 
firm demonstration. In these later papers, he was willing to concede it to 
his strongly Kantian colleagues. 


The notion of rigid geometrical figure might indeed be conceived as trans- 
cendental in Kant’s sense, namely, as formed independently of actual experience, 
which need not exactly correspond therewith, any more than natural bodies 
do ever in fact correspond exactly to the abstract notion we have obtained of 
them by induction. Taking the notion of rigidity thus as a mere ideal, a strict 
Kantian might certainly look upon the geometrical axioms as propositions given 
a priori by transcendental intuition which no experience could either confirm or 
refute, because it must first be decided by them whether any natural bodies 
can be considered rigid.²


This done, he could better emphasise the strength of his second contention:
Euclidean space is an empirically based concept. Even if the forms he 
had originally emphasised as experiential were treated as being a priori, 
they were inadequate to determine Euclidean space, and our system of 
geometry. 


1. The axioms of geometry, taken by themselves out of all connection with 
mechanical propositions, represent no relations of real things. When thus 
isolated, if we regard them with Kant as forms of intuition transcendentally 
given, they constitute a form into which any empirical content whatever will 
fit and which therefore does not in any way limit or determine beforehand the 
nature of the content. This is true, however, not only of Euclid’s axioms, but 
also of the axioms of spherical and pseudospherical geometry. 

2. As soon as certain principles of mechanics are conjoined with the axioms 
of geometry we obtain a system of propositions which has real import, and which 
can be verified or overturned by empirical observations, as from experience it 


1 Ibid., p. 319. 2 Ibid., pp. 319-20.
can be inferred. If such a system were to be taken as a transcendental form of
intuition and thought, there must be assumed a pre-established harmony
between form and reality.¹


These views of Helmholtz were much more radical than any he had
espoused before. In his earlier pursuits he was simply showing that it was 
a plausible suggestion that spatial perceptions be determined by sub- 
consciously learned experiences. Here he is willing to pass over that point 
to reach the more important one, that Euclidean geometry requires the 
addition of the principles of mechanics to the ideas innately given (or 
unconsciously learned). These principles are not transcendentally given or
subconsciously learned; they are consciously determined. Euclidean
geometry becomes a physical science.

This interpretation of geometry resulted in a great deal of philosophical
discussion and disagreement. One critic went so far as to contend that if 
Helmholtz were correct, the great tradition of German philosophy was 
dead and all German youth would have to be educated in England in the 
future!² Less emotional was J. P. N. Land, whose article ‘Kant’s Space
and Modern Mathematics’ appeared in Mind in 1877. In the same way 
that mathematicians had been critical of his mathematical work, Land 
chided Helmholtz for treating the problems of philosophy in an unacceptable
and non-professional way. Land prefaces his points by saying:

Before attempting to answer this argument, let me briefly point out a funda- 
mental error that appears to hinder many adepts of positive science from 
realising the true nature of problems belonging to the theory of knowledge, or 
critical metaphysics. 

The ‘fundamental error’ to which he is here alluding is the failure to 
maintain a distinction between ‘objectivity’ and ‘reality’. 

In our wanderings on the border between science and philosophy we are apt 
to forget that it is impossible to move on both sides of the boundary line at once, 
and that whoever crosses it shifts his problem as well as his method.... We 
admit a real world, in physics independent of all appearance to anybody’s sense 
or reason, and take for its exact counterpart the world that offers itself to the 
mens sana in corpore sano after exhausting all the means of research at the com- 
mand of mankind. Science has no suspicion [sic] of a distinction between 
‘objectivity’ and ‘reality’.⁴ 

Specifically he argues against Helmholtz’s contention that Euclidean space 
is an empirically formed conception: 


Our intuition of space may be empirical without a real space to correspond, 
provided there be any reality whatever compelling the mind to exert its native 


1 Ibid., p. 321. 2 Krause [1878], ‘Vorwort’. 
3 Land [1877], p. 38. 4 Ibid., pp. 38-9. 
powers in constructing space as we know, which the mind would not do unless 
so compelled. In that case space, Euclid’s space, would remain a form of 
intuition, a priori and transcendental.¹ 


Here Land maintained that we can never really be sure in what way the 
world we perceive as ‘objective’ relates to the ‘real’ world. Therefore, he 
continues, whatever ‘reality’ might be acting on our minds calls forth a 
Euclidean view of space. There is no guarantee that what we perceive 
as Euclidean space is in some essential way so. 

The crux of this argument rests on the concept of ‘imaginability’. If 
Helmholtz were correct and non-Euclidean spaces were truly ‘imaginable’, 
our perceptions triggered by an unknown ‘reality’ could possibly be 
different. We would be able to conceive of a different series of perceptions 
which would lead us to accept a different model of ‘reality’. If we could 
truly ‘imagine’ a non-Euclidean space, then Land’s argument for the 
innate nature of Euclidean space is in serious doubt. Land therefore 
focuses his argument on Helmholtz’s use of the idea ‘imaginable’. 

To Helmholtz, the ability to describe a situation and describe the series 
of sensations we would have were we in that situation constitutes the 
ability to ‘imagine’. (cf. p. 247.) Land criticises Helmholtz for this use 
because he finds it to be fundamentally different from the Kantian one. 


In the present case, the first question is whether any sort of space besides 
the space of Euclid be capable of being imagined. More than three dimensions, 
it is allowed, we are quite unable to represent. But we are told of spherical and 
pseudospherical space, and non-Euclideans exert all their powers to legitimate 
these as space by making them imaginable. We do not find that they succeed in 
this, unless the notion of imaginability be stretched far beyond what Kantians 
and others understand by the word.² 


Specifically, he does not accept the mathematical illustration of non- 
Euclidean spaces as sufficient to render them imaginable. 


And when we are assured that Beltrami has rendered relations in pseudo- 
spherical space of three dimensions imaginable by a process which substitutes 
straight lines for curves, planes for curved surfaces, and points on the surface 
of a finite sphere for infinitely distant points we might as well believe that a 
cone is rendered sufficiently imaginable to a pupil by merely showing its pro- 
jection upon a plane as a circle or a triangle. . . . As a form of the objective world, 
which remains the same from whatever point we inspect it, we can imagine, 
not any space in which motion implies flattening or change of form of any kind, 
but only the space known from our sense-experience, the space of Euclid. 
All other ‘space’ contrived by human ingenuity may be an aggregate with 
fictitious properties and a consistent algebraical analysis of its own, but space 
it is called only by courtesy.³ 


1 Ibid., pp. 42-3. 2 Ibid., p. 41. 3 Ibid., pp. 41-2. 
This presentation of imaginability does not really address the issues which 
concerned Helmholtz. It takes the form of a dogmatic statement based 
on tradition, rather than dealing with the implications of Helmholtz’s 
argument. 

Helmholtz wrote ‘The Origin and Meaning of Geometrical Axioms (II)’ 
primarily in response to Land’s article. He responds to the criticism of 
his definition of ‘imaginability’ by elaborating on his former definition. 

I advanced a definition which was to the effect—that for this [imaginability] 
we need the power of fully representing the sense-impressions which the object 
would excite in us according to the known laws of our sense-organs under all 
conceivable conditions of observation, and by which it would be distinguished 
from other similar objects. I am of opinion [sic] that this definition contains 
stricter and more definite requirements for the possibility of imagination than 
any previous one, and, as far as I can see, Prof. Land does not contend that 
these requirements cannot be satisfied for objects in spherical or pseudospherical 
spaces.¹ 
As Land’s concept of imaginability is never positively defined, beyond 
reiterating and attempting to clarify his position, Helmholtz cannot further 
argue his point. 

In response to Land’s other criticism, that he did not adequately 
distinguish between ‘objectivity’ and ‘reality’, Helmholtz proceeded to 
draw such a distinction for the sake of argument. 

If...it is still assumed hypothetically that the axioms are really given 
a priori as laws of our space-intuitions, two kinds of equivalence of space- 
magnitudes must be distinguished: a) Subjective equality given by the hypo- 
thetical transcendental intuition; b) Objective equivalence of the real substrata of 
space relations, proved by the equality of physical states or actions, existing 
or going on in what appear to us as congruent parts of space. The coincidence 
of the second with the first could be proved only by experience; and as the second 
would alone concern us in our scientific or practical dealings with the objective 
world, the first, in case of discrepancy, must be discounted as a false show.² 
Helmholtz’s orientation in this discussion is reminiscent of his position 
in relation to the physiological ‘intuitionists’. In both cases his essential 
objection is to positing barriers standing between human perception or 
thought and experiential reality. In the case of the assumption of a 
‘transcendental geometry’ he sums up his position as follows: 

The assumption of a knowledge of axioms by transcendental intuition apart 
from all experience is a) an unproved hypothesis, and b) an unnecessary hypothesis, 
since it explains nothing in our actual knowledge of the outer world that cannot 
equally be explained without its help ... c) a wholly irrelevant hypothesis, 
since the propositions it includes can be applied to the relations of the objective 
world only after their objective validity has first been independently proved.³ 

1 Helmholtz [1878], p. 215. 2 Ibid., pp. 212-13. 3 Ibid., p. 225. 
The arguments in this quotation are plainly those of a man primarily 
concerned with experimental science. Within this framework, Helmholtz 
concluded that there was no need for the assumption of a transcendental 
geometry. 

In conclusion, it is clear that both in mathematics and philosophy 
Helmholtz was pursuing the implications of his own interest in human 
perception. In mathematics his work was similar to Riemann’s, but this 
was only by chance; he did not know of Riemann’s researches until after 
he had begun his own. His work did not grow out of a mathematical 
tradition, though it meshed neatly with the growing interest in non- 
Euclidean geometries at the time. 

It did not go unnoticed. Mathematical professionals recognised the 
technical flaws in his work, which they attributed to his non-expert status. 
Thus, referring to the initial failure to incorporate pseudo-spherical space 
into his construct, Felix Klein is quoted by Bertrand Russell as saying: 
‘Helmholtz is not a mathematician by profession, but a physicist and 
physiologist.... From this non-mathematical quality of Helmholtz, it 
follows naturally that he does not treat the mathematical portion of his 
work with the thoroughness which one would demand of a mathematician 
by trade.’¹ Sophus Lie also criticises Helmholtz with the somewhat sour 
comment that ‘the results contained in it [Helmholtz’s work] can hardly 
be considered as proven.’² These criticisms were directed against the 
technicalities of Helmholtz’s treatment, however. His basic orientation 
within mathematics was given serious consideration. Developments of the 
foundations of geometry through the group of rigid body transforms 
were rigorously pursued, notably in Lie’s work of 1890, Über die Grund- 
lagen der Geometrie. 

Further, Helmholtz’s basic approach to geometry through physiology, 
maintaining the close tie between perceptual and geometric space, was 
one which continued to find expression. In 1902, when Hilbert’s strictly 
analytic Grundlagen der Geometrie was reviewed by Poincaré, Poincaré 
criticised it for being incomplete: 

I regret that in this treatment of the metric axioms there is no trace of the 
notion of which Helmholtz was the first to recognize the importance: I refer 
to the displacement of rigid bodies. One could have kept this notion in its 
natural role without sacrificing the logical character of the axioms.³ 


Clearly here, a purely logical construction of geometry is not seen to be 
adequate. Helmholtz’s efforts to tie geometrical ideas to perceptual space 
through the free movement of rigid bodies are still seen as germane. 

1 Russell [1897], p. 25. 2 Lie [1890], p. 2. 3 Poincaré [1902], p. 255. 
In both mathematics and philosophy, the basic characteristic of Helm- 
holtz’s contribution is his empiricism. It was a concern with the experiences 
underlying perception which led him into mathematics, and which 
determined the axiom system he developed for his work. In his later, 
more philosophical works, the relation between our perceptions and the 
unexpected non-Euclidean geometry constitutes his major preoccupation. 
Helmholtz’s mathematical and philosophical papers exhibit strong con- 
tinuity both in the questions he explored and the attitudes he brought to 
bear. They are consistently motivated by an interest in human perception, 
and universally reveal the commitment to empiricism found first in his 
physiological studies. Helmholtz was able to bring new insights to bear, 
which led to original results. But considered from his own viewpoint, 
what might seem like startling versatility and range of research is really 
a reflection of consistent interest in a particular problem, defined by 
himself. 


Harvard University 
The Role of Statistical Mechanics in 
Classical Physics* 


by DAVID LAVIS 


INTRODUCTION 


There exists a class $\{\mathcal{S}\}$ of systems, any member $\mathcal{S}$ of which can be 
regarded on the one hand as a mechanical system $\mathcal{S}^{(M)}$ and on the other 
as a thermodynamic system $\mathcal{S}^{(T)}$. The mechanical system $\mathcal{S}^{(M)}$ is charac- 
terised by its classical or quantum mechanical state which evolves with 
time according to the appropriate differential equation (Hamilton’s 
equation or Schrödinger’s equation respectively). The thermodynamic 
system $\mathcal{S}^{(T)}$ is characterised by the assumption that, if undisturbed for a 
sufficiently long period of time, it will attain a state called its equilibrium 
state. Such an equilibrium state is specified by a small set of parameters, 
these being interrelated by the laws of thermodynamics. The primary rôle 
of statistical mechanics is to interpret the parameters of $\mathcal{S}^{(T)}$ and to 
derive the relationships between them in terms of the properties of $\mathcal{S}^{(M)}$. 

We may enquire whether this task could be performed only by a theory 
with a probabilistic component. One apparently obvious affirmative 
answer to this question would be of the following kind: ‘Some proba- 
bilistic component is necessary because of the quantum nature of matter. 
Probability theory is involved because of its essential rôle in quantum 
theory’. Such an answer is inadequate on two counts. First because of 
the existence and relative success of classical statistical mechanics and 
secondly because of the dual rôle of probability theory in quantum 
statistics. From the point of view of this discussion quantum considerations 
are irrelevant and we shall henceforth suppose $\mathcal{S}^{(M)}$ to be a classical 
mechanical model. The question of the necessity or otherwise of statistical 
mechanics remains open and will arise in the course of the discussion. 

It must be emphasised at the outset that the aim of this paper is to give a 
possible rational reconstruction of the current programmes of statistical 
mechanics. It is necessary to stress the motive of rational reconstruction 
because the development of statistical mechanics has been such that a 
survey entirely in terms of the declared views of writers in the field would 
on the one hand be confusing and on the other leave many gaps. Certain 
terms (e.g. ‘ensemble’) are used, as we shall see, in different senses by 
different writers. It is also the case that many of those most concerned 
with the development of statistical mechanics have left certain key aspects 
of their programmes implicit or ambiguous. This is particularly so with 
respect to their views on probability theory. The attempt has been made 
therefore to construct the possible routes from mechanics to thermo- 
dynamics in terms of concepts rather than personalities. To aid this task 
the material of the discussion has been divided into four levels as shown 
in the diagram. Level 1 contains classical mechanics together with the 
two properties which would seem, separately or together, to characterise 
those mechanical systems which give rise to thermodynamic behaviour. 
Level 2 contains the philosophical positions with respect to probability 
theory which would seem to provide sufficient support for the construction 
of statistical mechanics. Level 3 contains most of the disputed aspects of 
the derivation of statistical mechanics. It is in itself a mainly mathematical 
level but the choices between different approaches within it are to some 
extent determined by the philosophical considerations arising from level 
2.² Level 4 is common ground for all workers in statistical mechanics. 
There are differences in the degree of mathematical sophistication with 
which the material is manipulated, but there is little doubt that the 
microcanonical and canonical distributions are a necessary part of equi- 
librium statistical mechanics.³ The arrows in the diagram represent my 
personal opinion as to the possible rational routes from level 1 to level 4. 
Part of the purpose of this paper is to attempt to expose the weaknesses 
of these various routes. Since levels 1 and 4 present the least problems 
and since they represent the beginning and end points of all the pro- 
grammes under discussion we shall consider them first, after which we 
shall deal with the problematic areas represented in levels 2 and 3. 


LEVEL 1 


Let $\mathcal{S}^{(M)}$ be a stationary Hamiltonian system of $N$ identical microsystems 
(gas molecules, dipoles etc.) each with $\nu$ degrees of freedom (translational, 
rotational etc.). The macroscopic dimension of the system is given by one 
extensive parameter $A^{(M)}$.⁴ The system has the following properties: 

(i) The state of the system at time $t$ is given by the phase vector 
$(p(t), q(t)) = (p_1(t), p_2(t), \ldots, p_N(t), q_1(t), q_2(t), \ldots, q_N(t))$ which specifies 
a point in the $2\nu N$-dimensional phase space $\Gamma$, $p_i(t)$ and $q_i(t)$ being 

1 I agree with the view of Hobson ([1971], p. 33) that the subjective theory of Ramsey and 
de Finetti has no relevance to the physical sciences. The grouping of other points of view 
into two traditions, labelled ‘logical’ and ‘scientific’ follows the classification of Gillies 
[1973] except that he includes the subjective position within the logical tradition. 

2 Or, in the case of the ergodic method, by the attempt to by-pass level 2. 

3 This remark should not be taken to imply that no alternative link between mechanics 
and thermodynamics, avoiding statistical mechanics altogether, is possible (see p. 259, n. 2). 

4 For the sake of simplicity we consider only identical microsystems and only one extensive 
parameter. In general there may be more than one type of microsystem and more than 
one extensive parameter. Typical extensive parameters are the volume and the magnetic 
and electric moments of the system. 
respectively the momentum and configuration vectors of the $i$th micro- 
system. 

(ii) There exists a continuous differentiable Hamiltonian function $H$ on 
$\Gamma$¹ and the evolution of the phase point in $\Gamma$ satisfies Hamilton’s equation 

$$\frac{d}{dt}(p, q) = (-\nabla_q, \nabla_p)H(p, q). \qquad (1)$$


It is convenient, although not essential, to suppose that the value of the 
Hamiltonian is the total energy of the system. 

It is a standard result of the theory of differential equations that, given 
a point $(p_0, q_0)$ in $\Gamma$, there exists a solution of Hamilton’s equation which 
determines a unique path of the system through $(p_0, q_0)$. The phase point 
of the system moves along the path, passing through $(p_0, q_0)$ at time $t = 0$. 
We have therefore a set of continuous mappings of $\Gamma$ into itself para- 
metrised by $t$. This set is called a Hamiltonian flow. An integral $g$ of the 
equation of motion is a function of $(p, q)$ and $t$ whose value remains 
unchanged along a path. The equation to be satisfied by $g$ is 

$$\frac{\partial g}{\partial t} - \nabla_q H \cdot \nabla_p g + \nabla_p H \cdot \nabla_q g = 0. \qquad (2)$$


A time independent integral is called a constant of the motion. The equation 
of motion will have $(2\nu N - 1)$ local constants of motion (constants which 
remain unchanged in value at least for limited periods of time and for 
certain regions of $\Gamma$). Of these there will be some which are constant 
throughout phase space and for all time; these are called global. A global 
constant of motion is called isolating if it defines a hypersurface in phase 
space. For a fluid system, consisting of particles which undergo perfectly 
elastic collisions at the boundaries of the container, the components of 
the total linear momentum will be local, but not global, constants. The 
Hamiltonian will however represent an isolating constant. A path of the 
system with at least one point on a particular energy hypersurface 
$\Sigma_E = \{(p, q) : H(p, q) = E\}$ will lie entirely in that hypersurface. 
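
The statement that a path stays on its energy hypersurface can be checked on a toy case. The following is a minimal sketch, assuming a single unit-mass, unit-frequency harmonic oscillator (an illustrative choice, not a system discussed in the text), for which the Hamiltonian flow is known in closed form. The function names are hypothetical.

```python
import math

def hamiltonian(p, q):
    """Energy of a unit-mass, unit-frequency oscillator: H = (p^2 + q^2) / 2."""
    return 0.5 * (p * p + q * q)

def flow(p0, q0, t):
    """Exact Hamiltonian flow: dq/dt = dH/dp = p and dp/dt = -dH/dq = -q."""
    c, s = math.cos(t), math.sin(t)
    return p0 * c - q0 * s, q0 * c + p0 * s

def path_energies(p0, q0, times):
    """Value of H at each sampled point of the path through (p0, q0)."""
    return [hamiltonian(*flow(p0, q0, t)) for t in times]

times = [0.1 * n for n in range(200)]
energies = path_energies(1.0, 0.5, times)
# The spread is zero up to floating-point rounding: H is an isolating constant.
spread = max(energies) - min(energies)
print(f"E = {energies[0]:.6f}, spread along path = {spread:.2e}")
```

Here the hypersurface is simply a circle in the (p, q) plane; for a many-particle system the same check would need a numerical integrator, which conserves energy only approximately.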

In defining the mechanical system $\mathcal{S}^{(M)}$ we have simply outlined the 
general characteristics of a stationary Hamiltonian system. It is usually 
argued, however, that there is some additional feature peculiar to those 
kinds of mechanical systems which can be linked to thermodynamic 
systems, even though there is no general agreement on the nature of this 
feature. According to Grad [1967]: ‘The single feature which distinguishes 


1 To be accurate we have of course a set of Hamiltonians parametrised by $N$ and $A^{(M)}$. 
On these we impose the condition that $H/N$ is finite in the thermodynamic limit $N \to \infty$, 
$A^{(M)} \to \infty$, $N/A^{(M)}$ finite. A detailed analysis of the mathematics involved in taking the 
thermodynamic limit is given by Ruelle [1969]. 
statistical mechanics from ordinary mechanics is the large number of 
degrees of freedom’. On the other hand there are those (e.g. Jaynes [1957], 
Hobson [1971]) who would deny that statistical mechanics is exclusively 
concerned with large systems. For them it ‘is the study of incompletely 
specified systems’ (Hobson [1971], p. 2). 


LEVEL 4 


For the thermodynamic system $\mathcal{S}^{(T)}$ the variables can be divided into 
two sets. There are the independent variables which, in an experimental 
situation, are directly controlled by the experimenter. The remaining 
dependent variables will, for fixed values of the independent variables, 
change their values until they reach an equilibrium state, after which they 
remain constant. The thermodynamic system $\mathcal{S}^{(T)}$, at equilibrium, has 
the following properties: 

(i) If undisturbed all variables of the system maintain constant values. 

(ii) The macroscopic dimension of the system is given by one extensive 
parameter $A^{(T)}$ and there exists a ‘generalised force’ $F^{(T)}$ such that an 
increment of work $dW$, performed on the system by its environment, as 
it moves through a succession of equilibrium states increasing $A^{(T)}$ by 
$dA^{(T)}$, is given by $dW = F^{(T)}\,dA^{(T)}$. 

(iii) The thermodynamic energy $U$, the entropy $S$ and the absolute 
temperature $T$ are defined. Any incremental change of state $(dU, dS, dA^{(T)})$ 
for which the system moves through a succession of equilibrium states 
must satisfy the fundamental thermodynamic relationship 

$$dU = T\,dS + F^{(T)}\,dA^{(T)}. \qquad (3)$$

By including a discussion of the microcanonical and canonical distribu- 
tions in level 4 (see diagram) we introduce, at this level, the mathematics 
of what is usually referred to as ‘equilibrium statistical mechanics’. I 
argue that this is legitimate for two reasons: 


(1) Although this work can be, and is, formulated in a variety of 
different ways, it is common to all attempts to relate $\mathcal{S}^{(M)}$ and $\mathcal{S}^{(T)}$.² 

(2) The statistical connotation imposed on the mathematical structure, 
in most texts and on some interpretations, is not integral to it.³ 


1 If $A^{(T)}$ is the volume then $F^{(T)}$ is minus the pressure exerted by the environment; if 
$A^{(T)}$ is the magnetic or electric moment then $F^{(T)}$ is the corresponding field. Again we 
have restricted the discussion to systems with only one extensive parameter. We shall 
also refrain from discussing systems which exchange matter with their environment. 

2 To my knowledge, the only exception to this is the attempt to justify (some parts of) 
thermodynamics using the methods of molecular dynamics (see e.g. Alder and Wainwright 
[1959], [1960], [1962]). 

3 I shall argue that it is one of the aims of the ergodic method to eliminate statistical 
connotations. 


In other words what I wish to do, at this level, is to delineate those 
parts of the mathematics which do not depend on philosophical pre- 
suppositions. The questions on which there is real disagreement are 
contained in level 3. We now list the problems which, in a formal sense, 
will be resolved at this level. We need 

(P1) A definition of equilibrium in terms of the properties of $\mathcal{S}^{(M)}$. 

(P2) A connecting link between the macroscopic properties of $\mathcal{S}^{(M)}$ and 
$\mathcal{S}^{(T)}$. 

The method of solution of these problems is on the following lines. Let 
$\rho$ be a function of $(p, q)$ and $t$ which defines a measure on $\Gamma$. The measure 
$M[\gamma_t]$ of the set $\gamma_t$ in $\Gamma$ at time $t$ is given by 

$$M[\gamma_t] = \int_{\gamma_t} d\Gamma\, \rho(p, q, t). \qquad (4)$$
We suppose that the measure is normalised (i.e. $M[\Gamma] = 1$) and that the 
measure of any set $\gamma_t$ as it moves with the Hamiltonian flow is preserved. 
It is not difficult to show¹ that a necessary and sufficient condition for the 
measure density function to be that of a preserved measure is that it 
satisfies Liouville’s equation 

$$\frac{\partial \rho}{\partial t} - \nabla_p \cdot (\rho \nabla_q H) + \nabla_q \cdot (\rho \nabla_p H) = 0.\,{}^{2} \qquad (5)$$
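
Preservation of measure under a Hamiltonian flow can be illustrated in the simplest case of a uniform density. The sketch below assumes a single free particle with $H = p^2/2$ (an illustrative choice, not taken from the text), whose flow shears a square of phase points into a parallelogram while leaving its area unchanged. The helper names are hypothetical.

```python
def free_flow(p, q, t):
    """Exact flow of H = p^2 / 2: momentum is constant, position drifts."""
    return p, q + p * t

def polygon_area(points):
    """Shoelace formula for the area of a polygon given as (p, q) vertices."""
    total = 0.0
    n = len(points)
    for i in range(n):
        p1, q1 = points[i]
        p2, q2 = points[(i + 1) % n]
        total += p1 * q2 - p2 * q1
    return abs(total) / 2.0

# A unit square of initial conditions in the (p, q) plane.
region = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
# The flow is linear, so the image of the square is the polygon
# spanned by the images of its vertices.
moved = [free_flow(p, q, 3.0) for p, q in region]

print(polygon_area(region), polygon_area(moved))
```

The region changes shape but not area, which is the content of measure preservation for a constant density; a non-uniform density would have to be carried along with the flow as well.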


We now consider the extended dynamic system $\{\rho, \mathcal{S}^{(M)}\}$ consisting of 
$\mathcal{S}^{(M)}$ together with the normalised measure preserving density function $\rho$. 
It is in terms of this extended system that P1 and P2 are solved. The 
solutions are as follows: 

(S1) The system is in equilibrium if the measure density function is not 
an explicit function of time. 

(S2) Related to any property $X^{(T)}$ of the thermodynamic system $\mathcal{S}^{(T)}$, 
with the exception of the entropy $S$ and the absolute temperature 
$T$, there is for $\{\rho, \mathcal{S}^{(M)}\}$ a phase function $X^{(M)}$. The variable $X^{(T)}$ 
is given by the mean value of $X^{(M)}$ with respect to $\rho$, i.e. 

$$X^{(T)} = \int_\Gamma d\Gamma\, \rho(p, q) X^{(M)}(p, q). \qquad (6)$$

At equilibrium the thermodynamic entropy is given by 

$$S = -k \int_\Gamma d\Gamma\, \rho(p, q) \ln\{c(N)\rho(p, q)\} \qquad (7)$$


1 The mathematics is the same as that used to prove the hydrodynamic continuity equation. 

2 Since $(-\nabla_q H, \nabla_p H)$ is a divergence-free vector field Liouville’s equation is identical to 
its adjoint, which is equation (2), the equation satisfied by integrals of the equation of 
motion. 
where $k$ is Boltzmann’s constant and $c(N)$ is the Boltzmann correcting 
factor.¹ 


If, as we assume, the value of the Hamiltonian H is the total energy of 
the system, then the phase function corresponding to the thermodynamic 
energy $U$ is the Hamiltonian. With respect to the pair of variables $F^{(T)}$ 
and $A^{(T)}$ two situations can arise: 

(a) If the system is mechanically isolated then $A^{(T)}$ is an independent 
variable, the corresponding phase function $A^{(M)}$ is a constant and we have 
$A^{(M)} = A^{(T)}$. The phase function corresponding to $F^{(T)}$ is $\partial H/\partial A^{(M)}$. 
This phase function will vary as the phase point moves along its path in 
phase space. An illustration of this situation is that of a volume of fluid 
in a container with rigid walls. Here the volume is the same, interpreted 
as either a mechanical or thermodynamic quantity. We see however that 
the force on the walls of the container arises from a large number of 
discrete impulses caused by the impacts on it of the particles of the fluid. 
The phase function corresponding to the pressure of the gas is the force 
per unit area averaged over the surface of the container. This quantity, 
which could be called the ‘microscopic pressure of the fluid’ will fluctuate 
with time unlike the corresponding thermodynamic pressure. 

(b) If the system is in mechanical equilibrium with its environment then 
the generalized force $F^{(T)}$ is determined by the environment and it is the 
mechanical extensive variable $A^{(M)}$ which will fluctuate. This, for a fluid 
system, would correspond to the case where one wall of the container is 
replaced by a piston. The pressure of the fluid is now a constant, given 
by the force per unit area exerted from outside on the piston. The ‘micro- 
scopic volume’ of the fluid will now fluctuate as the piston vibrates under 
the influence of particle impacts. 

Since equilibrium fluctuations of dependent variables can, for physical 
systems, be experimentally detected (see e.g. MacDonald [1962]) the 
mechanical viewpoint here represents a positive advance over that of 
classical thermodynamics. 

There remains the problem: 

(P3) We need to know the equilibrium forms of the measure density 
function corresponding to particular thermodynamic conditions. 

There are for any particular system a variety of possible thermodynamic 
conditions. In this paper we shall consider only two for which problem 
P3 is solved in the following way: 


1 The precise form of $c(N)$, which is designed to avoid the difficulties of Gibbs’s paradox 
and to give identity, in the case of a perfect gas, with the corresponding formulae of 
quantum statistics, is determined by the physical system under investigation. We do 
not need an independent formula for the absolute temperature. Once the form for 
entropy is established the temperature arises from the thermodynamic procedures. 
(S3) For an isolated system with energy $E$ the phase point moves entirely 
on the energy surface $\Sigma_E$. The natural (uniform) measure on the 
energy shell $\{(p, q) : E - \tfrac{1}{2}\Delta E < H(p, q) < E + \tfrac{1}{2}\Delta E\}$ is preserved 
by the Hamiltonian flow. In the limit $\Delta E \to 0$ a preserved measure, 
called the microcanonical distribution, is induced on $\Sigma_E$, in terms of 
which 

$$X^{(T)} = \frac{1}{\omega(E)} \int_{\Sigma_E} \frac{d\Sigma}{|\nabla H(p, q)|}\, X^{(M)}(p, q), \qquad (8)$$

where 

$$\omega(E) = \int_{\Sigma_E} \frac{d\Sigma}{|\nabla H(p, q)|} \qquad (9)$$

is the structure function of the system. 

For a system in thermal contact with its environment at absolute 
temperature $T$ the measure density function is that of the canonical 
distribution which is given by 

$$\rho(p, q) = \frac{\exp(-H(p, q)/kT)}{\int_\Gamma d\Gamma\, \exp(-H(p, q)/kT)}. \qquad (10)$$


Use of the canonical distribution and the formulae (6) and (7) of S2 leads 
very simply (see e.g. Hill [1956]) to the fundamental thermodynamic 
relationship given above. We are therefore left with two problems. We 
need to justify 


(P4) The use of the measure density function to relate the variables of 
$\mathcal{S}^{(T)}$ and $\mathcal{S}^{(M)}$. 
(P5) The equilibrium forms for the measure density function. 


These are the problems which must be tackled at levels 2 and 3. We 
should however note before passing to these levels that it is not essential 
to give independent justifications of the microcanonical and canonical 
distributions. A system in thermal equilibrium with its environment can 
be taken to be part A of an isolated system of which the remainder, part B, 
is taken to be the environment of A. If A and B are assumed to be only 
weakly interacting¹ then the canonical distribution can be derived by 
taking the asymptotic limit as B becomes infinite.² The basic problem is 
therefore the justification of the microcanonical distribution. 
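
To see the mean-value prescription of S2 at work with the canonical distribution, one can evaluate a canonical average numerically. The following is a rough sketch, assuming a single one-dimensional harmonic oscillator with $H = (p^2 + q^2)/2$ and a plain midpoint-rule integration over a truncated phase plane; the function names, the grid parameters and the value of $kT$ are illustrative assumptions, not part of the text.

```python
import math

def canonical_mean(phase_function, hamiltonian, kT, half_width=12.0, n=400):
    """Midpoint-rule estimate of the canonical average of phase_function.

    The phase plane is truncated to [-half_width, half_width]^2; the cell
    area cancels between numerator and normalising integral.
    """
    h = 2.0 * half_width / n
    numerator = 0.0
    partition = 0.0
    for i in range(n):
        p = -half_width + (i + 0.5) * h
        for j in range(n):
            q = -half_width + (j + 0.5) * h
            weight = math.exp(-hamiltonian(p, q) / kT)
            numerator += weight * phase_function(p, q)
            partition += weight
    return numerator / partition

def oscillator(p, q):
    """Hamiltonian of a unit-mass, unit-frequency oscillator."""
    return 0.5 * (p * p + q * q)

# The phase function for the thermodynamic energy U is the Hamiltonian itself.
mean_energy = canonical_mean(oscillator, oscillator, kT=1.5)
print(f"U = {mean_energy:.6f}")
```

For this quadratic Hamiltonian the estimate should agree closely with the equipartition value $U = kT$, which gives a quick sanity check on the normalisation in the canonical density.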




1 The Hamiltonian of the total system is the sum of the Hamiltonians of A and B. 

2 Using the central limit theorem (Khinchin [1949]) or the method of steepest descents 
(Kubo [1965]) this can be shown on the basis of certain reasonable mathematical 
assumptions. 
LEVEL 2 


At this level we discuss the ways in which probability theory is used as a 
means of justifying the use of a measure density function to relate the 
variables of $\mathcal{S}^{(M)}$ and $\mathcal{S}^{(T)}$. As can be seen from the diagram the ergodic 
method is alone in its attempt to relate equilibrium thermodynamics and 
the underlying mechanical structure without resort to this kind of justifica- 
tion. It will not therefore feature in our discussion until level three. The 
common feature of all the approaches described here is that, on some 
interpretation or another, the measure density function emerges as a 
probability density function. 

One of the difficulties of discussing the use of probability concepts in 
statistical mechanics is that, in much of the literature on the subject, these 
concepts are introduced in a rather informal manner. This is a tradition 
in statistical mechanics which dates from the works of Boltzmann 
([1896]) and Gibbs ([1902])¹ and which persists to some extent until 
the present day. Apart therefore from those occasions when we are con- 
sidering a writer who makes specific efforts to state his views on probability 
theory² we shall need to be somewhat tentative in classifying authors into 
one camp or another. 


The Logical View 

This may be characterised by the claim that we can ‘cognise correctly a 
logical connection between one set of propositions which we call our 
evidence and which we suppose ourselves to know, and another set which 
we call our conclusions, and to which we attach more or less weight 
according to the grounds supplied by the first. . . . It is not straining the 
use of words to speak of this as the relation of probability. . . . No proposi- 
tion is in itself either probable or improbable, . . . and the probability of 
the same statement varies with the evidence presented . . .’ (Keynes 
[1921], quoted by Gillies [1973], p. 8). In the context of statistical mechanics 
it is perhaps more appropriate to refer to data or information rather than 
to evidence. A complete set of data for the mechanical system $\mathcal{S}^{(M)}$ would 
allow us to solve the equations of motion. The rationale for the use of 
probability theory, in the logical tradition, is therefore the need to deal 
with a system for which the information is incomplete (see diagram).³ On 


1 With respect to Boltzmann and Gibbs this is not surprising in view of the time at which 
they were writing. 

2 This is true of those authors like Jaynes and Hobson who use the information theory 
approach. 

3 The logical view is sometimes referred to as the ‘ignorance view’ although, as we shall 
see, the ignorance method at level 3 is only one of the possible methods which can adopt 
this view of probability.
the basis of a particular set of data we must derive in some way the ‘best’
probability density function for the system. This point of view features in
the development of statistical mechanics in two ways. In the first of these
it is closely linked to the justification of the microcanonical distribution. 
It will be recalled that the microcanonical distribution purports to be that 
which is appropriate to an isolated system. The information which we
have about such a system consists of some value $E$ for the total energy. The
Keynesian Principle of Indifference leads to the use of a probability density
function zero outside of, and uniform within, the energy shell
$\{(p, q) : E - \tfrac{1}{2}\Delta E < H(p, q) < E + \tfrac{1}{2}\Delta E\}$ and from this we derive the
microcanonical distribution in the limit $\Delta E \to 0$. This would seem to be the
approach of Tolman ([1938]), although we must be rather careful in
analysing his writings because of his use of the idea of ensembles (see
below). A more recent and developed approach within the logical tradition 
is the ignorance method of Jaynes ([1957], [1965], [1967]) and Hobson 
([1971]). These writers adopt a specific principle by means of which, 
given a set of data, a unique (best) probability assignment can be derived. 
For Jaynes this is the Principle of Maximum Entropy. He identifies the 
concept of entropy as defined by Shannon in information theory (see
Shannon and Weaver [1949]) with entropy in statistical mechanics; the 
appropriate probability distribution is the one which maximises this 
entropy relative to the given set of data and the thermodynamic entropy 
is equal to the maximum value of the statistical mechanical entropy. 
Hobson’s formulation differs from this in detail but is ultimately equivalent.
He derives an expression for the uncertainty in a probability distribution
and adopts the principle (called by him Jaynes’ Principle) that the
appropriate probability distribution is that which maximises the uncertainty
relative to the given data. It is only at a later stage that identification is
made between entropy and uncertainty. The advantage of Hobson’s
formulation, at this point, as compared to that of Jaynes, is that it avoids
the immediate (and tempting) conflation of the information theory and
thermodynamic concepts of entropy. Apart from these slight differences
it is clear that, for both Jaynes and Hobson, the appropriate density for a
system for which the datum consists simply in the restriction of the phase 
point to an energy shell is that given by the Principle of Indifference. 
Hobson claims (probably correctly, although see the discussion below)
that Tolman [1938] shares his view that statistical mechanics is the study 
of incompletely specified systems. He also argues that Gibbs rejected 
both the ergodic method and the viewpoint that statistical mechanics is 


1 The uniform distribution over the energy surface $\Sigma_E$ is not preserved by the Hamiltonian
flow.
the study of large systems and that he must therefore have been to some
extent an early member of the same school. 
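
The Principle of Maximum Entropy discussed above can be illustrated by a minimal numerical sketch. It assumes, purely for illustration, a system with a finite set of energy levels; the level values, the function names and the bisection bracket are hypothetical and are not taken from Jaynes or Hobson. When the only datum is the set of accessible levels the maximising assignment is uniform, as the Principle of Indifference requires; when a mean energy is also given the assignment takes the canonical form, with the multiplier beta fixed by the datum.

```python
import math

def canonical_probs(levels, beta):
    """Probabilities exp(-beta*E_j)/Z(beta) over a finite list of energy levels."""
    ground = min(levels)  # shifting all energies by a constant leaves the ratios unchanged
    weights = [math.exp(-beta * (e - ground)) for e in levels]
    z = sum(weights)
    return [w / z for w in weights]

def mean_energy(levels, beta):
    return sum(p * e for p, e in zip(canonical_probs(levels, beta), levels))

def max_entropy_probs(levels, u, lo=-50.0, hi=50.0):
    """Maximum-entropy assignment on `levels` with prescribed mean energy u.

    Bisection on beta, using the fact that the mean energy of the canonical
    form decreases monotonically as beta increases.
    """
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mean_energy(levels, mid) > u:
            lo = mid  # mean energy still too high, so beta must be larger
        else:
            hi = mid
    return canonical_probs(levels, 0.5 * (lo + hi))

def uncertainty(probs):
    """Shannon's expression -sum p ln p."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

levels = [0.0, 1.0, 2.0, 3.0]                    # hypothetical energy levels
indifferent = max_entropy_probs(levels, u=1.5)   # u equals the unweighted average
constrained = max_entropy_probs(levels, u=1.0)   # a lower mean energy is prescribed
print(indifferent, uncertainty(indifferent))
print(constrained, uncertainty(constrained))
```

In this sketch the additional datum lowers the maximised uncertainty, which is the sense in which the data restrict the probability assignment.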


The Scientific View

This view may be characterised by the fact that it considers ‘the theory 
of probability as a science of the same order as geometry or theoretical 
mechanics’ (von Mises [1957], p. vii). In this tradition the probabilistic
properties of the system under investigation emerge if we consider the
results of a large number of identical¹ operations on the system. For von
Mises (see e.g. von Mises [1957], p. 29) this large number of operations or
events is the collective; the probability of a particular outcome is then
defined as the limit of the relative frequency of this outcome in the collective,
as the size of the collective increases to infinity. For Popper on the other
hand (see Popper [1959]) the collective, rather than providing a means of
defining probability, serves to reveal the latent propensity of the system
to behave in a certain way.² The idea of the collective features in statistical
mechanics in two, more or less distinct, ways. In the first of these the
behaviours of the individual microsystems, or particles, constitute the
events and the collective is the mechanical system as a whole. This is the
point of view of von Mises, when he asserts (von Mises [1957], p. 182)
that ‘A volume of gas containing a great number of molecules appears . . .
as a system not different in principle from the automatic lottery machine.
. . .’³ In this context we are concerned about assigning a probability density
function to a single particle and the crucial characteristic of the system 
is that it has a large number of particles (or degrees of freedom). The 
second way in which the idea of the collective features in statistical 
mechanics is in the situation in which we are concerned about the 
properties, not of an individual particle, but of the whole system. We 
may, for example, be concerned with the total energy of the system. For 
a system of particles whose interaction is given in terms of a pairwise 
potential function, the total energy of the system is not reducible to the
sum of individual particle energies and the von Mises viewpoint is
inapplicable. The method here is to take the various possible behaviours
of the whole system as the events and to consider a collective of macro- 
scopically identical systems. This collective is called an ensemble. The 
idea of an ensemble of systems was introduced by Gibbs. His idea was 


1 Identical in the ‘macroscopic’ sense that repeated throws of the same die are identical. 

2 There are of course, besides the relative frequency and propensity interpretations, other
variants of the scientific view (see Gillies [1973]).

3 The automatic lottery machine is designed to mix the lottery slips by shaking in such a
way as to make ‘it impossible to follow the fate of one of them. . . . (It is provided with)
some mechanism which would eject a lot through a funnel after a sufficient period of
shaking the container’ (von Mises [1957], p. 177). 
that we imagine the phase space $\Gamma$ to be filled with phase points of ‘a
great number of independent systems, identical in nature, but differing 
in phase’ (Gibbs [1902], p. 5). Dividing the density of phase points by 
the number of members of the ensemble and taking the limit, in which the 
number of members of the ensemble and the density of phase points 
become large, their quotient remaining finite, produces a normalised
density on phase space which is taken to be a probability density function. 
Although this way of producing the probability density function has a 
strong flavour of the von Mises relative frequency interpretation it is not 
entirely clear, from Gibbs’ discussion, that this was his intention. It is 
certainly true, as Hobson points out, that Gibbs believed that ‘The laws
of statistical mechanics apply to conservative systems of any number of
degrees of freedom, ...’ (Gibbs [1902], p. viii). But we have seen that a 
scientific interpretation of probability, using an ensemble, is not restricted 
to systems with a large number of degrees of freedom. It is only the von 
Mises viewpoint, which takes the system itself as the collective, which is 
so restricted. In the case where the ensemble is the collective, it is the 
number of members of the ensemble which is large. It could however be
argued (and this I think is the essence of Hobson’s point) that if we are 
to introduce probability at all then it must be because exact mechanical 
calculations are not viable. Since it appears that this inviability arises, 
either because of the large number of degrees of freedom, or because the 
system is not completely specified, then Gibbs must have taken the latter 
view with the concomitant logical approach to probability. In this case the 
ensemble must be regarded as a heuristic aid to understanding. 

Since the time of Gibbs it has needed a considerable act of will on the 
part of teachers and researchers in statistical mechanics to avoid all 
references to ensembles, so much have they become part of the language 
of the subject. It is fairly clear that, for at least some writers of textbooks 
(e.g. Hill [1956]), the ensemble is a means of defining probability after the 
manner of von Mises. But in most cases its role is rather more confusing. 
Particularly interesting in this respect is the work of Tolman ([1938]).
He begins his analysis by using an ensemble of systems in conjunction 
with the Principle of Indifference. He assumes that ‘the phase point for a 
given system (of the ensemble) is just as likely to be in one region of the 
phase space as in any other region of the same extent which corresponds 
equally well with what knowledge we do have as to the condition of the 
system’ (Tolman [1938], p. 60). This remark seems to support Hobson’s
contention that his view of probability is the logical one. It must however 
be borne in mind that an important aspect of the logical view is that
probability is a relationship between a set of data and our knowledge of a single
system. The weighted average of a phase function taken with respect to
the probability density function is the expectation value which reflects in 
some way our expectations about the system, based on our knowledge. 
For Tolman this average is the ensemble average. For him it provides 
results which ‘are to be regarded as true on the average for the systems
in an appropriately chosen ensemble rather than as necessarily true in any 
individual case’ (Tolman [1938], p. 63). This remark has an ‘objective’ 
scientific ring which accords ill with Keynes’ comments on probability 
quoted above. 


LEVEL 3

The Ergodic Method

Let $\gamma$ be a subset of the phase space $\Gamma$ which is invariant with respect to
the Hamiltonian flow and let $\rho$ be a time-independent density function on
$\gamma$ which is preserved by the flow and for which $\gamma$ has total measure unity.
It is supposed that $X^{(M)}$ is a phase function integrable over $\gamma$. It is argued
that a measurement of the thermodynamic quantity $X^{(T)}$ corresponds to
the average $\tilde{X}(p_0, q_0, \tau)$ of $X^{(M)}$ over a period of time $\tau$, computed along a
path of the system beginning from the point $(p_0, q_0)$ in $\gamma$. It is assumed
that $\tau$ is long with respect to (a) the microscopic correlation time, (b) the
relaxation time of macroscopic variables and (c) the time taken to destroy
purely local constants of motion. On the basis of these assumptions it is
further argued that the result of the measurement is effectively the infinite
time average obtained by allowing $\tau$ to tend to infinity. For this view of
measurement to be meaningful it is of course necessary to show, not only
that this time limit exists, but also that it is independent of the point
$(p_0, q_0)$. Let $\bar{X}$ be the average of $X^{(M)}$ over $\gamma$ with respect to the measure
density function $\rho$. In terms of this the following results were proved by
Birkhoff ([1931]):¹

(i) $\lim_{\tau \to \infty} \tilde{X}(p_0, q_0, \tau) = \hat{X}(p_0, q_0)$ exists almost everywhere in $\gamma$ (i.e.
except possibly for a set of measure zero with respect to $\rho$).

(ii) $\hat{X}$ is a constant of motion almost everywhere in $\gamma$.

(iii) $\bar{X} = \bar{\tilde{X}} = \bar{\hat{X}}$.

Let us put aside, for the moment, any doubts which arise from the existence
of exceptional points and examine how these results affect the definition
of measurement. It is clear that (a) if $\hat{X}$ is a constant almost everywhere
in $\gamma$ then $\bar{\hat{X}} = \hat{X}$ and hence from (iii) $\hat{X} = \bar{X}$ holds almost everywhere
in $\gamma$, (b) if $\hat{X} = \bar{X}$ holds almost everywhere in $\gamma$ then $\hat{X}$ is a constant
almost everywhere in $\gamma$. If $\hat{X} = \bar{X}$ holds almost everywhere in $\gamma$, for all


1 Birkhoff’s paper contains an explicit proof only of (i). Results (ii) and (iii) are trivial
consequences of the main result.
phase functions integrable over $\gamma$, then the system is said to be ergodic.
We see therefore that, for ergodic systems, we may (almost) legitimately
write

$$X^{(T)} = \bar{X} \qquad (11)$$

which is the relationship (6) between $X^{(T)}$ and $X^{(M)}$ proposed in S2 level
4, with $\rho$ identically zero at all points in $\Gamma$ not belonging to $\gamma$.

It would be tempting to suppose that ergodicity has removed the 
statistics from statistical mechanics by establishing the proposed
relationship between $X^{(M)}$ and $X^{(T)}$, with $\rho$ as an uninterpreted measure density
function used simply as a tool to calculate time averages. The difficulty in
accepting this point of view arises if we consider what we are doing by
neglecting the set of exceptional points. We are supposing that a measurement
is never (or hardly ever) made on a system which at the beginning
of the measurement has phase point $(p_0, q_0)$ belonging to that set of points
of $\gamma$ for which the infinite time average does not exist. In supposing that
such a situation never in practice occurs we have tacitly assumed some
probabilistic interpretation for $\rho$ in terms of which sets of zero measure
with respect to $\rho$ have zero probability. Nevertheless, and accepting this
slight obeisance in the direction of an interpreted probability, it would 
still seem to be useful to prove the ergodicity of a system. 
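
The equality of time and phase averages for an ergodic system, and its failure otherwise, can be illustrated by a toy sketch. The 'system' here is not a Hamiltonian flow but a rotation of the unit circle, x -> (x + alpha) mod 1, which preserves the uniform density; this choice, the observable and all names are assumptions made for illustration only. For irrational alpha the rotation is ergodic and the time average does not depend on the starting point; for rational alpha the circle decomposes into invariant sets and the time average depends on where the orbit begins.

```python
import math

def time_average(f, x0, alpha, steps):
    """Average of f along the orbit x -> (x + alpha) mod 1 starting from x0."""
    total, x = 0.0, x0
    for _ in range(steps):
        total += f(x)
        x = (x + alpha) % 1.0
    return total / steps

def phase_average(f, cells=100_000):
    """Midpoint-rule integral of f over [0, 1) with respect to the uniform density."""
    return sum(f((k + 0.5) / cells) for k in range(cells)) / cells

def observable(x):
    """A hypothetical phase function on the circle."""
    return math.cos(2.0 * math.pi * x) ** 2 + x

irrational = math.sqrt(2.0) - 1.0   # ergodic rotation
rational = 0.25                     # every orbit visits only four points

phase = phase_average(observable)
ergodic_a = time_average(observable, 0.1, irrational, 100_000)
ergodic_b = time_average(observable, 0.7, irrational, 100_000)
periodic_a = time_average(observable, 0.0, rational, 100_000)
periodic_b = time_average(observable, 0.2, rational, 100_000)
print(phase, ergodic_a, ergodic_b, periodic_a, periodic_b)
```

The rational case shows in miniature why a decomposition into invariant subsets of non-zero measure is incompatible with ergodicity.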

The original ergodic hypothesis (usually attributed to Boltzmann, but 
see Brush [1964]) assumed that the path of the system passed through 
every point of $\gamma$. Since $\hat{X}$ is a constant of motion it is clear that this
hypothesis is sufficient to establish that $\hat{X}$ is a constant on $\gamma$. It is equally clear,
on measure theoretic grounds, that the ergodic hypothesis is false.¹ An
alternative quasi-ergodic hypothesis, to the effect that the path of the system
passes arbitrarily close to every point of $\gamma$, has not proved sufficient to
establish ergodicity, although it is certainly necessary. There is however
a condition, both necessary and sufficient, which is intuitively somewhat
similar to the quasi-ergodic hypothesis. To prove the necessity of the
quasi-ergodic hypothesis we would assume that, given a particular path $l$
of the system, there exists a point $P$ of $\gamma$ for which a neighbourhood $\varepsilon(P)$
of $P$ contains no points of $l$. This is clearly impossible for an ergodic system
since we could arbitrarily change the value of the phase function on $\varepsilon(P)$,


1 The path of the system is of measure zero in $\gamma$. A straightforward proof that the ergodic
hypothesis is false is as follows. Let $P$ and $P'$ be two points of $\gamma$. Since the path of the
system has a unique tangent direction at every point there are at most two arcs of the
path joining $P$ to $P'$. Let $\sigma$ be an arc joining $P$ to $P'$ which is neither of the arcs of the
path of the system. Now the points at which the path of the system intersects $\sigma$ (if it
does so) can be ordered with respect to their values of $t$ and are denumerable. But for
the ergodic hypothesis to be true the path of the system would have to pass through all
the non-denumerable set of points of $\sigma$. (This is a similar argument to one used by the
Ehrenfests ([1912], English edition 1959, footnote 98) to explain the difference between
the ergodic and quasi-ergodic hypotheses.)


The Role of Statistical Mechanics in Classical Physics 269 


thereby changing the phase average but not the time average. The even
stronger assumption, that $\gamma$ can be decomposed into two invariant subsets
of non-zero measure with respect to $\rho$, is clearly inconsistent with ergodicity
by a slight modification of the same argument. Metric transitivity, which
is defined to be the negation of this assumption, is therefore a necessary
condition for ergodicity. That it is also a sufficient condition can be seen
quite simply by the following argument. Suppose that $\hat{X}$ were not constant
almost everywhere on $\gamma$, then $\gamma$ can be decomposed into two invariant
subsets of non-zero measure with respect to $\rho$ according to the values of
$\hat{X}$ and therefore it cannot be metrically transitive.

For the stationary Hamiltonian system $\mathcal{S}^{(M)}$, the Hamiltonian is an
isolating constant of motion and the energy surface $\Sigma_E$ is an invariant
subset of $\Gamma$. If $\Sigma_E$ is metrically transitive then the system is ergodic on
$\Sigma_E$ and the microcanonical distribution, given in S3 level 4, is proved
(with the reservations described above with respect to exceptional points)
by the ergodic method. From this the canonical distribution can be
derived and the links between $\mathcal{S}^{(M)}$ and $\mathcal{S}^{(T)}$ are established.

Suppose that, besides the Hamiltonian, there exist other independent 
isolating constants of motion. In this case it is clear¹ that $\Sigma_E$ cannot be
metrically transitive and the development of statistical mechanics outlined
above is invalid. Only in the case of a system of more than two hard spheres,
contained within a parallelepiped box with perfectly elastic walls, has this
been proved rigorously not to be so (Sinai [1967]).² In general the problem
of proving the non-existence of additional isolating constants of motion
is very difficult. It would therefore seem reasonable to investigate the
consequences of their existence. This has been done by Lewis [1960]. 
Beginning with the set of hypersurfaces, which represent almost every- 
where all isolating constants of motion, he showed that the system is 
ergodic on the set obtained as the intersection of vanishingly small shells 
defined with respect to each hypersurface, the measure being the uniform 
measure in phase space. On the basis of this it is possible to obtain generali- 
sations of the microcanonical and canonical distributions. Unfortunately 
the form of thermodynamics developed from these distributions differs 
significantly from the standard form (Grad [1952]). In particular we have, 
rather than one parameter representing temperature, a whole range of 
parameters, one for each of the isolating constants of motion.

To circumvent the difficulties associated with metric transitivity and 
the possible existence of additional isolating constants of motion, attempts 
have been made to weaken the requirements of ergodicity in such a way 

1 Any other isolating constant of motion will divide the energy surface into two invariant
subsets each of non-zero measure.

2 See also the discussion of Sinai’s results at the end of Farquhar [1967].

as still to satisfy the needs of statistical mechanics. It is not difficult to
show (see e.g. Farquhar [1964]) that, for the measure density function $\rho$
and the invariant subset $\gamma$, and for any $\varepsilon > 0$,

$$\text{Measure}\left[(p, q) : \left|\frac{\hat{X}}{\bar{X}} - 1\right| > \varepsilon\right] < \frac{1}{\varepsilon^{2}}\,\text{Variance}\left[\frac{X^{(M)}}{\bar{X}}\right] \qquad (12)$$

where the variance is calculated using the measure density function $\rho$.
In particular this result applies to the microcanonical measure over the
energy surface $\Sigma_E$. In order for this result to be of any use for our purpose
it is necessary to show that the right-hand side of equation (12) is small.
This is certainly not the case for arbitrary phase functions and systems
with any number of degrees of freedom. As we have already argued, the
reason for the introduction of methods, other than direct integration,
for dealing with the system $\mathcal{S}^{(M)}$ is, either that the system has a large
number of degrees of freedom, or that it is incompletely specified. Since
the assumption of incomplete specification seems to lead inevitably to
statistical methods, it seems reasonable to see the use of the ergodic method
as a consequence of the system having a large number of degrees of freedom.
Up to this point in our discussion of the ergodic method this feature has
played no rôle. But even in the case of systems with a large number of
degrees of freedom it is not necessarily the case that the variance of
$X^{(M)}/\bar{X}$ is small for all $X^{(M)}$. Khinchin [1949] was however able to show
that if the Hamiltonian $H$ and the phase function $X^{(M)}$ are both sums of
contributions from the individual microsystems (sum functions) then the
variance of $X^{(M)}/\bar{X}$ is of the order of $1/N$. The unwanted restriction
represented by the Hamiltonian having to be a sum function has since
been removed by Mazur and van der Linden [1963], who considered the
asymptotic limit for $N$ of sum functions for systems of particles interacting
through short-range hard-core interactions. Their argument requires an
assumption that the zeros of a certain polynomial related to the
configurational integral of the system are not dense in intervals of the real axis.
The difficulty again encountered with this approach is that we have in
some way to know that sets of small measure with respect to $\rho$ have small
probability of occurrence.
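
The $1/N$ behaviour of the relative variance of a sum function, and the Tchebyshev-type bound which makes it useful, can be illustrated by a small sampling sketch. It assumes, as a simplification which is not Khinchin's microcanonical setting, that the single-particle contributions are independent and exponentially distributed with unit mean; the function names, sample sizes and seed are hypothetical.

```python
import random
import statistics

def sum_function_samples(n_particles, samples, rng):
    """Samples of X = sum of n independent single-particle contributions.

    Each contribution is drawn from a unit-mean exponential distribution,
    a hypothetical stand-in for a single-particle energy.
    """
    return [sum(rng.expovariate(1.0) for _ in range(n_particles))
            for _ in range(samples)]

def relative_variance(values):
    """Variance of X/<X>, with <X> taken as the sample mean."""
    mean = statistics.fmean(values)
    return statistics.pvariance(values, mu=mean) / mean ** 2

def fraction_outside(values, eps):
    """Fraction of samples with |X/<X> - 1| > eps."""
    mean = statistics.fmean(values)
    return sum(abs(v / mean - 1.0) > eps for v in values) / len(values)

rng = random.Random(0)  # fixed seed so that the run is repeatable
eps = 0.2
results = {}
for n in (10, 40, 160):
    xs = sum_function_samples(n, 4000, rng)
    results[n] = (relative_variance(xs), fraction_outside(xs, eps))
    # columns: N, N * relative variance, observed fraction, Tchebyshev bound
    print(n, results[n][0] * n, results[n][1], results[n][0] / eps ** 2)
```

Under these assumptions the product of $N$ and the relative variance stays close to one, so that the bound on the fraction of exceptional samples tightens as $N$ grows.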





The Evolution Method 

Once some probability interpretation has been given to the measure 
density function $\rho$, the measure $M[\gamma_t]$ of any subset $\gamma_t$ of $\Gamma$, at time $t$, will
be the probability of the phase point of the system being in $\gamma_t$ at that time.
It is then clear why measure must be preserved by the Hamiltonian flow
since, if the set $\gamma_t$ evolves into the set $\gamma_{t'}$ at time $t'$, the phase point will be
in $\gamma_{t'}$ at time $t'$ if and only if it is in $\gamma_t$ at time $t$. The probability density
function must therefore satisfy Liouville’s equation. We have defined
equilibrium for the thermodynamic system $\mathcal{S}^{(T)}$ as the state into which
$\mathcal{S}^{(T)}$ evolves if left undisturbed for a sufficiently long period of time. The
most natural method of obtaining the equilibrium distribution would
therefore seem to be to investigate the solution of Liouville’s equation in
the limit as $t$ tends to infinity. This is part of the programme of the Brussels
School of statistical mechanics (see e.g. Prigogine [1970], Balescu [1971]). 
These researchers trace the evolution of the Fourier coefficients of the 
probability density function in terms of the destruction and creation of 
interparticle correlations. They have succeeded in showing (at least for a 
certain class of interparticle interactions) that, in the thermodynamic
limit, the dependence of the probability density function on initial
correlations decays in the infinite time limit. In this time limit the probability
density evolves to the canonical form. They would also claim to have 
resolved the recurrence paradox of Zermelo ([1896]) and the reversibility 
paradox of Loschmidt ([1876], [1877]). These paradoxes arise by counter- 
posing, on the one hand the evolution to equilibrium of a thermodynamic 
system, and on the other the quasi-periodic and time-symmetric behaviour 
of a finite mechanical system. The recurrence problem is resolved by
taking the thermodynamic limit in which the recurrence time becomes 
infinite. The reversibility problem is resolved by observing that, although 
the initial correlations can be neglected at any stage in the evolution of the 
system in the positive time direction, such a process is inadmissible if 
time is reversed because in this direction the correlations remaining from 
the initial state play an increasingly important rôle in the process by 
which the system is returned to its initial state.
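
The requirement that measure be preserved by the Hamiltonian flow can be checked directly in the simplest case. The sketch below assumes a one-dimensional harmonic oscillator with unit mass and frequency, whose exact flow is a rotation of the (q, p) plane; the uniformly damped flow used for comparison is an artificial, non-Hamiltonian construction, and all names are hypothetical.

```python
import math

def harmonic_flow(point, t):
    """Exact flow of H = (p**2 + q**2) / 2 (unit mass and frequency assumed)."""
    q, p = point
    c, s = math.cos(t), math.sin(t)
    return (q * c + p * s, p * c - q * s)

def contracting_flow(point, t, gamma=0.1):
    """A non-Hamiltonian comparison: the same rotation with uniform damping."""
    q, p = harmonic_flow(point, t)
    decay = math.exp(-gamma * t)
    return (q * decay, p * decay)

def polygon_area(points):
    """Shoelace formula for the area enclosed by a polygon in the (q, p) plane."""
    total = 0.0
    for (q1, p1), (q2, p2) in zip(points, points[1:] + points[:1]):
        total += q1 * p2 - q2 * p1
    return abs(total) / 2.0

cell = [(1.0, 0.0), (1.2, 0.0), (1.2, 0.3), (1.0, 0.3)]  # a small set of phase points
t = 2.5
area_before = polygon_area(cell)
area_hamiltonian = polygon_area([harmonic_flow(pt, t) for pt in cell])
area_damped = polygon_area([contracting_flow(pt, t) for pt in cell])
print(area_before, area_hamiltonian, area_damped)
```

Both flows are linear, so the image of the polygon is again a polygon and the comparison of areas is exact up to rounding; only the Hamiltonian flow leaves the area unchanged.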

These brief comments are not intended to do full justice to this approach 
to the solution of the basic problems of statistical mechanics. Particularly 
so since the evolution method is essentially an approach to the whole 
problem of the relationship between equilibrium and non-equilibrium 
situations, this being an area still plagued by fundamental disagreements. 
One thing however should be emphasised. The evolution method is not 
tied to any particular philosophy of probability. It needs only some 
interpretation of p as a probability density function in order to give it 
validity. 


The Ignorance Method 

This method, based as it is on a logical view of probability, relies on a 
principle by means of which, given a set of data, the appropriate equilibrium 
probability density function can be derived. We shall adopt the formulation 
of Hobson [1971]. As we have already indicated at level 2, he derives an
expression for the uncertainty in a probability distribution and adopts the
principle that the appropriate distribution is that which maximises the
uncertainty relative to the given data. He shows that, if we make certain
reasonable assumptions regarding the nature of information measurement,
and if we suppose that the most accurate measurement on the system is
capable of locating the phase point in a cell of measure $c(N)$ then there is a
unique formula for uncertainty given by

$$S(\rho) = -\int d\Gamma\, \rho(p, q) \ln\left[c(N)\,\rho(p, q)\right] \qquad (13)$$

The uncertainty is equated with the entropy, given in S2 level 4 equation
(7), when $\rho$ has been adjusted to maximise $S(\rho)$ with respect to the
restrictions imposed on it by the available data.
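
The maximisation can be sketched with undetermined multipliers. The following is an illustrative derivation only, not a transcription of Hobson's own argument: it takes the uncertainty in the form $S(\rho) = -\int d\Gamma\, \rho \ln[c(N)\rho]$ and assumes that the data consist of normalisation together with a prescribed mean energy $U$; the multipliers $\lambda$ and $\beta$ are introduced here for the purpose.

```latex
% Assumed data: normalisation and a prescribed mean energy U.
\begin{align*}
&\text{maximise } S(\rho) = -\int d\Gamma\, \rho \ln[c(N)\rho]
  \quad\text{subject to}\quad \int d\Gamma\, \rho = 1, \quad \int d\Gamma\, \rho H = U, \\
&\delta\Big\{ S(\rho) - \lambda\Big(\int d\Gamma\, \rho - 1\Big)
  - \beta\Big(\int d\Gamma\, \rho H - U\Big) \Big\} = 0 \\
&\Rightarrow\; -\ln[c(N)\rho] - 1 - \lambda - \beta H = 0 \\
&\Rightarrow\; \rho(p, q) = \frac{\exp[-\beta H(p, q)]}{Z(\beta)}, \qquad
  Z(\beta) = \int d\Gamma\, \exp[-\beta H(p, q)], \\
&\text{with } \beta \text{ fixed by } U = -\frac{\partial \ln Z(\beta)}{\partial \beta}.
\end{align*}
```

If the mean-energy datum is replaced by the restriction of the phase point to an energy shell, the same calculation with only the normalisation multiplier gives a density which is constant on the shell and zero outside it.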

Within the context of this approach, the purpose of statistical mechanics
is to predict values for the results of measurements. The thermodynamic
variable $X^{(T)}$ is taken to be the best prediction of a result of a measurement
of the corresponding phase function $X^{(M)}$. Since however, from
Tchebyshev’s inequality for any $\varepsilon > 0$,

$$\text{Probability}\left[\left|\frac{X^{(M)}}{\langle X \rangle} - 1\right| > \varepsilon\right] < \frac{1}{\varepsilon^{2}}\,\text{Variance}\left[\frac{X^{(M)}}{\langle X \rangle}\right] \qquad (14)$$

where $\langle X \rangle$ is the expectation value of $X^{(M)}$, it is argued that $\langle X \rangle$ is the
best prediction for $X^{(M)}$. The right-hand side of the relationship (6) in
S2 level 4 is now, with $\rho$ interpreted as a probability density function,
the expectation value of $X^{(M)}$ and this relationship is given a justification.
It is further argued that the variance of $X^{(M)}/\langle X \rangle$ is of the order of $1/N$
and thus that the predictions made will increase in accuracy as $N$ increases.
Although this can be established on physical grounds for certain phase
functions¹ a general statement of this kind would seem to rely on the kind
of result established by Khinchin and quoted, with respect to our
discussion of equation (12), at the end of the section devoted to the ergodic
method.

The great advantage, from a teaching point of view, of this method is 
that we are able to establish the microcanonical and canonical distributions 
with the minimum of mathematical manipulation. All we need is some 
simple applications of the method of undetermined multipliers. Its
weakness is that it makes no proper connection between the idea of
information and our detailed knowledge of the physical circumstances of 
the system under investigation. This weakness is well illustrated by the 
remark of Hobson ([1971] p. 84) to the effect that ‘the microcanonical 


1 In the case where the phase function is the Hamiltonian the variance of $H/\langle H \rangle$ is equal
to $k C_a (T/U)^2$, where $C_a$ is the heat capacity of the system at constant $a$, and both
$C_a$ and $U$ are of the order of $N$.
distribution is generally regarded as being applicable to closed systems
(i.e. systems having time-independent Hamiltonians), while the canonical
distribution is regarded as being applicable to systems in weak interaction
with a second system, where the interaction is time-dependent but random
(i.e. not precisely known)’. The point is that physical considerations of
the type referred to in this quotation must be built into the information
before the distributions are derived. If this is not done certain paradoxical
situations can arise. One such is contained in the following example¹ where
for the sake of simplicity we consider a system with discrete energy levels:


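Given a system with non-degenerate energy levels $E_1 < E_2 < \ldots$. Suppose
that we are first given the datum $D$, that the energy lies in the closed
interval $[E_1, E_n]$. Then from Jaynes’ principle the appropriate distribution
is the discrete form of the microcanonical distribution

$$\text{Prob}[E_j \mid D] = \begin{cases} 1/n, & 1 \le j \le n \\ 0, & \text{otherwise} \end{cases} \qquad (15)$$

Suppose now, to quote Hobson ([1971], p. 107) ‘the energy $E$ be measured
and let the datum be $\langle E \rangle = U$, where $U$ is given’. Referring to this new
piece of datum as $D'$ and using Jaynes’ principle we now have the discrete
form of the canonical distribution

$$\text{Prob}[E_j \mid D \text{ and } D'] = \begin{cases} \exp(-E_j \beta)/Z(\beta), & 1 \le j \le n \\ 0, & \text{otherwise} \end{cases} \qquad (16)$$

where

$$Z(\beta) = \sum_{j=1}^{n} \exp(-E_j \beta) \qquad (17)$$

and

$$U = -\frac{\partial \ln Z(\beta)}{\partial \beta} \qquad (18)$$

Now

$$\text{Prob}[E_j \mid D] = \sum_{D'} \text{Prob}[E_j \mid D \text{ and } D']\, \text{Prob}[D' \mid D] \qquad (19)$$

and since $D'$ varies over all values of $\beta$, which may be supposed to have a
density function $p(\beta)$, we have

$$\frac{1}{n} = \int d\beta\, p(\beta)\, \frac{\exp(-E_j \beta)}{Z(\beta)}, \qquad 1 \le j \le n \qquad (20)$$

But for $j = 1$ and all $p(\beta)$ except $p(\beta) = \delta(\beta)$

$$\int d\beta\, p(\beta)\, \frac{\exp(-E_1 \beta)}{Z(\beta)} > \int d\beta\, p(\beta)\, \frac{\exp(-E_1 \beta)}{n \exp(-E_1 \beta)} = \frac{1}{n} \qquad (21)$$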

1 This example was proposed by A. Shimony during a seminar at Chelsea College,
As indicated above, a way to resolve this paradoxical situation would be
to divide information into two types: (a) The results of measurements.
We may, for example, know the energy of the system or the temperature
of a heat-bath in which the system is immersed. These are the kinds of
data with respect to which uncertainty has been maximised. (b) General
observations of a non-quantitative kind concerning the system. We may,
for example, observe that the system is, as far as possible, thermally
isolated, or we may observe that steps have been taken to ensure, as far as
possible, that it is in thermal contact with its environment. In the above
example $D$ will properly consist, not simply of data concerning the range
of permitted energy levels, but it will also tell us that an attempt has been
made to isolate the system thermally, in which case $\Delta E = E_n - E_1$ should
be small. Again $D'$ will consist, not simply of the piece of datum $\langle E \rangle = U$,
but will also include the information that the system is in thermal
equilibrium with its environment. Thus the total information ($D$ and $D'$) will
contain a contradiction and, if we wish to accept $D'$ then $D$ must be
rejected as mistaken, invalidating the use of the conditional probability
formula (19).
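
The inequality on which the paradoxical example turns can be checked numerically. The sketch below is an illustration under stated assumptions: the four energy levels, the support of $p(\beta)$ and its weights are all hypothetical, the integral over $\beta$ is replaced by a finite sum, and $\beta$ is restricted to non-negative values.

```python
import math

def canonical_prob(levels, j, beta):
    """exp(-beta*E_j)/Z(beta) for the level at index j (levels sorted ascending)."""
    ground = levels[0]
    z = sum(math.exp(-beta * (e - ground)) for e in levels)
    return math.exp(-beta * (levels[j] - ground)) / z

def averaged_prob(levels, j, betas, weights):
    """Discrete stand-in for the integral of p(beta) * exp(-beta*E_j)/Z(beta)."""
    return sum(w * canonical_prob(levels, j, b) for b, w in zip(betas, weights))

levels = [0.0, 1.0, 2.0, 3.0]          # hypothetical E_1 < E_2 < E_3 < E_4
n = len(levels)
betas = [0.0, 0.5, 1.0, 2.0]           # hypothetical support of p(beta), all >= 0
weights = [0.25, 0.25, 0.25, 0.25]     # hypothetical p(beta), summing to one

lowest = averaged_prob(levels, 0, betas, weights)
highest = averaged_prob(levels, n - 1, betas, weights)
delta_case = averaged_prob(levels, 0, [0.0], [1.0])   # all weight at beta = 0
print(lowest, highest, 1.0 / n, delta_case)
```

With these choices the averaged probability of the lowest level exceeds $1/n$ and that of the highest falls below it; only when all the weight is placed at $\beta = 0$ is the uniform assignment recovered.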

While this point may seem relatively trivial, and by no means to indicate 
a general failure of the method as a whole, it does need to be clarified in 
the writings of those who use the ignorance method as their approach to 
statistical mechanics. 


The Ensemble Method 
As we have already indicated in our discussion at level 2, the use (at least 
verbally) of ensembles in statistical mechanics is both nearly universal 
and shrouded with a certain amount of ambiguity. As a method of obtaining 
the microcanonical and canonical distributions it does however have the 
virtue of great simplicity particularly in the case of systems with discrete 
energy levels (see e.g. Rushbrooke [1949]). In fact the mathematical 
procedure is almost identical to that used in the ignorance method. The 
difference between these two methods arises when we come to interpret 
what in the ignorance method is the expectation value, featuring as the 
best prediction about a single system, and what in the ensemble method is 
the ensemble average representing a property, not of a single system, but 
of the ensemble as a whole. One apparent advantage enjoyed by the en- 
semble method in this connection (over both the ignorance and ergodic 
methods) is in its treatment of the problem of the possible existence of 
isolating constants of motion in addition to the Hamiltonian.

We have seen at level 4 that once a uniform distribution over an energy
shell has been assumed the microcanonical and canonical distributions
can be rigorously derived. For a system for which only the energy is
known the Principle of Indifference provides the required distribution. 
The impasse of the possible existence of additional isolating constants of 
motion, which brings the ergodic method to a halt, is here resolved in a 
very simple way. If we do not know the values of these constants (whether 
we suspect their existence or not) we ignore them because we no longer 
expect one single system to be precisely modelled by the ensemble
behaviour. 

The difficulty with this argument is that the only way to test predictions
arising from statistical mechanics is to make measurements on an actual
system. Now suppose that the phase function $X^{(m)}$ is an isolating constant
of motion independent of the Hamiltonian. The ensemble average will
predict the value for $X^{(t)}$, the thermodynamic quantity corresponding to
$X^{(m)}$, by taking an average over all the possible values for $X^{(m)}$, whereas
for any one system the value of $X^{(m)}$ will be a constant which may differ
in any possible way from the ensemble average. To correctly model the
behaviour of the system in question we should choose a subensemble
with both H and $X^{(m)}$ fixed which brings us back to the problem of
determining additional constants of motion. If this is not done it is difficult 
to see how ensemble averages can be related to experimental measure- 
ments. The problem becomes even worse in the case of entropy which is 
directly related to the properties of the whole ensemble through our 
choice of a distribution for phase points of the ensemble. To refer to the 
equations resulting from statistical mechanics as ‘analogues’ of the corresponding
thermodynamic equations (as do both Gibbs ([1902], chapter 14)
and Tolman ([1938], chapter 13)) serves simply to confuse the issues.
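
As a rough numerical illustration of the footnoted expression for the entropy of an ensemble with discrete energy levels, the following Python sketch evaluates (k ln Ω)/n. It assumes, which the text does not state explicitly, that Ω is the multinomial count n!/(n_1! n_2! ...) for occupation numbers n_i; the function names are hypothetical and chosen only for this illustration. Because the result depends solely on how the members are spread over the levels, it is a property of the ensemble as a whole and not of any one member.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K


def ln_omega(occupations):
    """Natural log of the number of ways of distributing the ensemble members.

    Assumption (not from the source): Omega = n! / (n_1! n_2! ...), where
    occupations[i] = n_i is the number of members on energy level i.
    """
    n = sum(occupations)
    # lgamma(m + 1) equals ln(m!) and avoids overflow for large ensembles.
    return math.lgamma(n + 1) - sum(math.lgamma(m + 1) for m in occupations)


def entropy_per_member(occupations, k=K_B):
    """Entropy (k ln Omega) / n for an ensemble of n members."""
    n = sum(occupations)
    if n == 0:
        raise ValueError("the ensemble must contain at least one member")
    return k * ln_omega(occupations) / n


if __name__ == "__main__":
    # All members on one level: only one arrangement, so zero entropy.
    print(entropy_per_member([4, 0], k=1.0))
    # Members split evenly over two levels, in units where k = 1.
    print(entropy_per_member([500_000, 500_000], k=1.0))
```

For large n at fixed proportions p_i = n_i/n, Stirling's approximation gives (k ln Ω)/n close to -k Σ p_i ln p_i, so the second call should come out near ln 2 in units of k.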


FINAL DISCUSSION 

I have in this paper tried to distinguish the possible strategies in terms of
which attempts have been made to base equilibrium thermodynamics on 
the structure of the underlying mechanical system. The four headings at 
level 3 (see diagram) do, as far as I can see, represent the four existing 
methods. At the present stage of statistical mechanics it would seem to 
be impossible to make a definitive choice between these methods except 
on philosophical grounds. (Personal adherence to one or other view of 
probability theory would lead to outright rejection of one or other of the
methods.) Within its own terms of reference each of these methods has 
its strengths and weaknesses and it is the aim of this discussion to
summarise these, giving, where possible, some indications of how the
weaknesses could be overcome.

1 For a system with discrete energy levels the entropy is equal to (k ln Ω)/n, where
n is the number of members of the ensemble and Ω is the number of ways of distributing
these members over the energy levels.

The main strength of the ensemble method lies in its mathematical 
simplicity. It also gives at least the appearance of conceptual simplicity, 
especially if based on a scientific view of probability with the ensemble 
interpreted as a collective in the sense of von Mises. In this way the fact 
that the probability density function and the ensemble averages are 
properties of the ensemble is not essentially different from the view of 
von Mises ([1957], p. 12) that probability means ‘the probability of 
encountering a certain attribute in a given collective’. The situation seems
more difficult when we consider the entropy. This quantity is a property
of the ensemble related to no phase function at the mechanical level of a
particular system (compare equations (6) and (7)). It therefore cannot, 
even in principle, be envisaged as the result of a sequence of repeated 
measurements. It may be possible to overcome this difficulty by some 
change of philosophical position on probability (using for example Popper’s 
idea of propensity in which probability is more closely related to an in- 
dividual system) but I know of no discussion of this point. 

For the ignorance method the probability density function is calculated 
by maximising the uncertainty relative to a particular set of given data. 
On the level of formal calculation this again gives a very simple method of 
obtaining the probability density function appropriate to the information 
that we have about the system. As we have seen there is a difficulty, well 
illustrated by Shimony’s problem, if information is too narrowly inter- 
preted. It does however seem possible to overcome this if we are prepared 
to include more general non-quantitative observations as part of the known 
data. A more important objection to this formulation relates to the depend- 
ence of the probability density function on the state of knowledge of the 
observer. This in turn means that the entropy of the system is dependent 
on this state of knowledge. This seems to imply an anthropomorphic view
of entropy. Jaynes ([1965]) is fully prepared to accept this ‘not only in the 
well-known statistical sense that it measures the extent of human ignorance 
as to the microstate. Even at the purely phenomenological level, entropy 
is an anthropomorphic concept. For it is a property, not of the physical 
system, but of the particular experiments you or I choose to perform on 
it’. The argument given by Jaynes for coming to this conclusion seems to
rely on the observation that, whenever a measurement is made of the 
entropy of a system, only a subset of the possible variables of the system 
is taken into account. There will always be other thermodynamic variables 
which are implicitly assumed to remain constant. If we make an attempt 
to list all the variables of the system and to obtain a general formula for
the dependence of the entropy on the complete set then the task is, practically
speaking, impossible. The entropy of the system will therefore be
dependent on just those variables whose existence we choose to acknow- 
ledge and will thus be a function of the experiments we decide to perform. 
This argument seems to mix together two quite different problems: (a) 
the calculation of the entropy of a theoretical model, and (b) the correspondence
between the entropy of a theoretical model and the measured
entropy of a real system. We surely wish, for example, to say that the 
Sackur-Tetrode equation is an ‘objective’ formula for the entropy of a 
perfect gas in isothermal contact with its environment. The fact that no 
real system is a perfect gas is a distinct issue. In spite of these comments
it must be admitted that the ignorance method seems, at least potentially,
to be a consistent solution to the basic problem of relating mechanics and
thermodynamics (not only at equilibrium but also in non-equilibrium
situations (see Hobson [1971])). If we are inclined to reject it, it will
ultimately be because we are not prepared to subscribe to this ‘subjective’¹
view of classical physics. 

The important common factor shared by the ergodic and evolution 
methods is their attempt to take seriously the dynamic character of the 
underlying mechanical model. Some difficulties of the ergodic method have 
been described. Briefly these are the problem of actually proving that a 
system is ergodic on the energy hypersurface (ergodicity on an invariant 
subset arising from the presence of other isolating constants of motion is 
not satisfactory for statistical mechanics) and the need to attach ‘almost 
everywhere’ conditions to the theorems which have been proved. With 
respect to the latter difficulty we find ourselves in the trap of having to 
assign zero probability to sets of measure zero in order to explain the 
prevalence of thermodynamic behaviour in real systems. This problem 
arises again in the context of equilibrium fluctuations. It is certainly 
possible, once a measure on phase space has been derived, to calculate the 
variance of phase functions. We need however to know that this variance 
gives us some information about the spread of experimental measurements 
about their mean. For this to be the case it is necessary that, in some sense, 
the measure function should be construed as representing the probability 
of the results of measurements. Otherwise the credit accorded to statistical 
mechanics for its prediction of experimentally detectable equilibrium
fluctuations will be lost. The strength of the evolution method is that it
makes a direct attack on the problem of the approach to equilibrium,
deriving the equilibrium distribution as the long time limit of the solution
of Liouville’s equation. Its weaknesses are largely mathematical. By
using, as it does, various types of perturbation expansion it becomes
involved with uncontrolled approximations. It is difficult to obtain a clear
insight into the significance of neglecting higher order terms in these
expansions and it is virtually impossible to know whether their contributions
to the evolution of the system are significant.

1 ‘Subjective’ is the term used by Jaynes to describe his philosophical position on
probability. Any false impression which may be gained from this terminology is corrected
by Hobson who draws a clear distinction between the subjective (degrees of belief)
interpretation of Ramsey and de Finetti and his own position (and that of Jaynes)
which can be characterised by the quotation from Keynes given above (see also p. 257,
n. 1).

In conclusion it is worth emphasising that the purpose of this paper has 
been to attempt a rational reconstruction of current programmes for the
foundations of statistical mechanics. It may be that some entirely novel
approach, not envisaged here, will be developed in the future. In view of 
the successes of statistical mechanics at a practical level, it is difficult to 
believe that there will not come a time when the difficulties discussed here 
will be resolved. My own belief, for what it is worth, is that this resolution 
will come about from an approach which studies in detail the dynamic 
properties of the system. By this I do not of course mean that we should 
revert to attempts to solve the equations of motion. Apart from the prac- 
tical impossibility of doing this, such a programme would contribute little 
to our understanding of the criteria for the class of mechanical systems 
giving rise to thermodynamic behaviour. It does however seem possible 
that a study of the qualitative properties of systems (see e.g. Abraham 
[1967], Arnold and Avez [1968]) will be able to give insights in terms of 
which the foundations of statistical mechanics can be better understood. 


Chelsea College, University of London
Discussion


FURTHER DISCUSSION ON SPLIT-BRAINS AND HEMISPHERIC 
CAPABILITIES 


Beginning with the well-informed review by Puccetti (Puccetti [1973]) there
has been considerable discussion in this Journal on complete cerebral commissurotomy
(CCC) and its importance for believing in a duality of brains, minds
and/or persons. My interest in the philosophical implications exceeds my
philosophical expertise. But I have had a considerable experience with the
patients under discussion, so I write mainly to point out a few neurologic misconceptions,
as they appear in the paper of D. N. Robinson (Robinson [1976]).
One cannot philosophise constructively without first being clear about the facts.

Robinson (at the bottom of p. 74) refers to ‘... the paradoxical ability of a
patient to recognise familiar objects held in the hand but not to recognise the 
same object when presented for visual inspection ...’. This is the classical 
description of a so-called ‘visual agnosia’ but does not at all describe a CCC 
patient. The patient with CCC invariably names familiar objects when presented 
for visual inspection, or describes them accurately in the event that the objects’ 
names are not known; this is in contrast to neither naming nor describing an 
object when held in the left hand (of the right-handed patient). A CCC patient 
may occasionally guess the nature (and hence the name) of an object in the left 
hand when helped by certain minimal clues such as temperature, or a nociceptive 
(pain-causing) property of an object. Such clever guessing is often wrong, and 
when it is wrong the patient nevertheless with the same (left) hand correctly 
retrieves the misnamed article from an unseen collection. This sort of dissociation 
between conversational report and left hand behaviour has made a great impres- 
sion on everyone who has actually seen it (and there are many). 

I am not an expert on certain matters discussed by Robinson, such as backward
masking. But the view expressed by Robinson (in the absence, apparently,
of any personal experience with the patients under consideration) that the CCC 
syndrome is more like than it is unlike various psychiatric and neurologic 
disturbances of which he lists a goodly number, is quite at variance with my 
own experience as a clinician for twenty years, and is also at variance with the 
impressions of the many other neurologists, neurosurgeons and psychologists 
who have seen these patients at first hand (Bogen and Vogel [1975]). 

Robinson refers (on p. 77) to ‘the blind-folded patient’s ability to point’ with 
the left hand to an object (just previously held in that hand) in spite of the fact 
that it cannot be named. It’s important to understand that correct pointing 
(as distinguished from tactual selection) can only occur when the patient is 
able to see the collection of objects, of which one is to be pointed out. In other 
words, the blindfold must be removed after the test object has been removed 
from the patient’s hand. After restoration of vision, when asked to point the
patient will usually point to the correct object with the left hand. And then
having pointed (but not before) the patient can name the object or, if requested,
point to it with the right hand. 

This sort of lateralised dissociation does not depend upon the presence or 
absence of language. For example, the right hand (or manus, if you prefer) of
a split-brain monkey cannot retrieve an object just recently presented to the left
manus (and vice versa). The rat which is trained with information restricted to
one hemisphere (by cortical spreading depression) cannot do the just learned 
problem with the untrained hemisphere. When discussing these, and a host of 
other data, workers in the field spontaneously talk about what a hemisphere is 
doing or has done. 

If Robinson would like to know what one hemisphere can do (which is a 
great deal indeed), he might benefit from my recent review (Bogen [1974]). And
he should certainly avail himself of an opportunity to see humans who have only
one hemisphere, of whom there are a good many (Smith [1974]). Such individuals
provide an appropriate opportunity for anyone who desires to show the inapplicability
of the two-brain view for any particular phenomenon of ‘mental
duality’ (such as backward masking), because the demonstration of such a 
‘duality’ in hemispherectomised persons would be clear-cut evidence of the 
irrelevancy, for that phenomenon, of having two hemispheres. Those of us who 
have had considerable experience with both commissurotomised and hemi- 
spherectomised individuals can assure Robinson that the more striking features 
of the CCC syndrome do not and have never appeared in a hemispherectomised 
individual. The two-brain view is required to explain the CCC syndrome, even 
if such a view is not necessary for or even relevant to other examples of simul- 
taneous awareness and unawareness. In fact, having come to the two-brain 
view because of the CCC syndrome, many of us are inclined to suspect that 
various other phenomena of ‘mental duality’ might have this particular physio- 
logic basis. In our view, each supposed case of mental duality must be tested 
individually to see if it has its basis in (requires) the duality of the cerebral 
hemispheres. 

Robinson implies (p. 77, third paragraph) that no reasonable person (he says, 
‘we’) would attribute to a hemisphere such properties as a percept, memory or a 
judgment. He thereby excludes (from the set of reasonable persons) almost all 
of those investigators who have had direct contact with the data. 

The simplest, most straightforward way to talk about the facts is to speak of 
what a hemisphere can or cannot do, might do, or has just done. So far as the 
syndrome following CCC is concerned, it is a plain (and surely relevant) fact 
that everyone who has worked with these patients (or the related animal experi- 
ments) for any length of time has eventually adopted this usage no matter what 
his or her original resistance may have been. This usage seems to be independent 
of the species studied as well as of the ideologic, philosophic or stylistic pre- 
dilection of those who have had the relevant experiences. Among Communist 
scientists, attribution of mental states is usually eschewed with respect to any 
experimental subject; but they do speak of the capacities of individual hemi- 
spheres, as when they refer to ‘... the activating influence of one brain-half on 
the other ...’ (Mosidze et al. [1971], p. 768).

With respect to learning, they say: 


In split-brain dogs, unlike their normal cagemates, the temporary
connections underlying subtler auditory discriminations appeared not to
have been established in the hemisphere ipsilateral to the ear stimulated
(p. 769).


Consider next another group of Communist scientists, famous for their 
studies in the laboratory rat (Bureš et al. [1974]): 


Thus the first conclusion derived from the reversible split-brain studies 
is that unilateral engrams formed during learning with one hemisphere 
do not spontaneously spread during the learning-retrieval interval to the 
untrained hemisphere (p. 271). 


In the words of an internationally respected American scientist: 


It is now well known that the forebrain commissures allow sensory 
discriminations learned by one hemisphere to be performed by the other 
(Sullivan and Hamilton [1974]). 


Another investigator, who has had experience with rats, cats, monkeys and 
chimpanzees, dating from the very inception in the early 1950s of the split-brain
experiments, has expounded: 


The essence or meaning of sensory experiences achieved through the 
stimulation of those peripheral sensory receptors projecting to one hemisphere
are made available to the analytic mechanisms residing within the
other hemisphere through the commissural connections of the corpus 
callosum [and] these trace systems of the two hemispheres thereafter 
possess the potential of separate existence, as evidenced by their continued 
expression subsequent to total section of the commissure [although] for 
the cat using vision, the effects of direct sensory experience upon one 
hemisphere are markedly more effective in memory development than 
are the effects of transcommissural transmission of a conflicting task 
(Ebner and Myers [1962]; Myers [1970]).


Doty, a recent past president of the Society for Neuroscience, summarised his
views by saying,


... the callosal system: (a) enables each hemisphere to have access to
memory traces stored in other, and (b) controls the formation of engrams
in such a way that they are layed down only in a single hemisphere (Doty
and Overman [1976]).


And we note the following phraseology in a discussion of some striking
split-brain experiments in monkeys:


... when both hemispheres know a problem and each is free to respond,
one side usually controls responses (Gazzaniga [1971]).


And the foregoing typical observation is subject to the fact that


... the probability that a hemisphere will control a response is directly
related to its history of obtaining reinforcers. The hemisphere which is
most successful in earning reinforcement comes to dominate (Johnson and
Gazzaniga [1971], p. 708).


In the same vein are the opinions of eminent Hollanders discussing their
experiments in monkeys: 


Our findings are in striking agreement with those obtained in human 
patients . . . that each half of the brain may steer independent movements of 
arm, hand, and fingers contralaterally, but mainly arm movements ipsilaterally
[keeping in mind] that the limitation of the ipsilateral control
mainly pertains to relatively independent distal movements of the extremities 
[which is] easily masked by tactile guidance of the distal extremity move- 
ments through the nonseeing hemisphere (Brinkman and Kuypers [1972]). 


An Italian scientist, fortified by a long experience with cats, has also done 
notable experiments with intact (unoperated) humans from which he concluded, 


... the vertical, horizontal, and intermediate orientations are perceived
and responded to faster by the left hemisphere exactly because these
orientations are easily analyzed and categorized in verbal terms [whereas
in other conditions] the superior ability of the right hemisphere in analyzing 
spatial relationships would emerge and reverse the visual field asymmetry 
(Berlucchi [1975]). 


Next consider the opinion of prominent investigators of the CCC human: 


... when a hemisphere is intrinsically better equipped to handle some 
task, it is also easier for that hemisphere to dominate the motor pathways 
(Levy, Nebes, and Sperry [1971]). 

Lest one suppose that the immediately foregoing usage is simply ascribable 
to the influence of a famous professor on his young students, consider the words 
of a mature, independently renowned scientist trained in a British tradition of 
understatement: 


Had our studies stopped at this point, we might have concluded that 
tactile pattern recognition is solely a function of the right hemisphere 
[but other data suggest either] that the separation of the hemispheres 
reduces their functional efficiency or that the left hemisphere normally 
participates in tasks of this kind, perhaps by adding some useful verbal tag 
once the pattern has been accurately perceived by the right hemisphere
(Milner [1974], p. 78).


Such manner of speech also appears in the same article with reference to the 
results of lateralised lesions: 


It looks then as though the early left-hemisphere lesion leaves the right 
hemisphere free to develop its innate language capacities, whereas, in the 
intact brain, these capacities are actively suppressed by the left hemisphere 
at some critical stage in early development (Milner [1974], p. 85). 


From another Commonwealth scientist we have:


Each of the disconnected hemispheres is capable of perceiving the
identity of a familiar object or a representative of a familiar class of objects
... But only the left hemisphere can generate speech ... Very rarely, a
fragmentary spoken or written response may be initiated by the right
hemisphere after hemispheric deconnection, but such responses are rapidly
overridden by the left hemisphere, possibly in reaction to the commencement
of verbal report (Trevarthen [1974]).


A recent splendid review by Sperry includes the following paragraph: 


Although some authorities have been reluctant to credit the disconnected
minor hemisphere even with being conscious, it is our own interpretation 
based on a large number and variety of nonverbal tests, that the minor 
hemisphere is indeed a conscious system in its own right, perceiving,
thinking, remembering, reasoning, willing, and emoting, all at a charac- 
teristically human level, and that both the left and the right hemisphere 
may be conscious simultaneously in different, even in mutually conflicting, 
mental experiences that run along in parallel (Sperry [1974]). 


SUMMARY


In his opening paragraph Robinson avers, ‘These patients can identify palpably
but not visually those otherwise familiar objects presented to the left hand.’
But they do identify the objects visually! What he should have written is, ‘These
patients can identify manually (by correct retrieval or appropriate manipulation)
but not verbally, those otherwise familiar objects presented to the left hand.’
His misunderstanding of these and related facts likely contributes to his rejection
of the following conclusions: 


(1) There is an abundance of evidence (of which hemispherectomy is the most 
obvious) that one-half of the cerebrum (if it is a single hemisphere) can 
subserve the functions of a mind. Corresponding conclusions cannot be 
supported by removing the top, bottom, back, or front half of the cerebrum, 
rather than either the right half or the left half. 

(2) The split-brain phenomena (in many different species) show that an individual
with two hemispheres can at times have two minds; this same conclusion
cannot be supported by bisection of the cerebrum horizontally,
coronally or diagonally, rather than midsagittally. 

(3) That two hemispheres, in normal continuity, do support two minds most 
of the time is not proven, so far as I am aware. How often and when do two
hemispheres in continuity support two minds? How does an individual 
remain integrated when the cerebral commissures are cut? It is questions 
such as these, not those raised by Robinson, which concern most of us who 
are familiar with the facts. 

JOSEPH E. BOGEN 
Ross Loos Medical Group
Reviews


RESCHER, NICHOLAS [1973]: Conceptual Idealism. Oxford: Basil Blackwell.
£3.95. Pp. xiii+197.


This is an enthusiastic and intriguing work, even though obscure at one or two 
central points and even though Rescher leaves a number of unanswered and 
pressing questions. It also has something of a pioneering spirit in defending a 
form of thought that is out of fashion. Rescher’s brand of idealism differs inevitably
in many ways from earlier brands (Hegel, Bradley, McTaggart et al.)
though falling squarely under the general theme of idealism as the view that
mind has a very central role in the constitution of the world. Rescher’s is a
conceptual idealism which maintains the centrality of mind in our conception of
the world. Specifically Rescher seeks to show that the basic features in terms of 
which our conceptual scheme organises the world—as a plurality of concrete 
particulars located in space and time and interacting causally—are all of them 
imbued with an essential (though tacit) reference to minds and their capabilities. 
Such an idealism he usefully contrasts with rationalism as the view that the 
knower makes an active contribution to the constitution of knowledge; con- 
ceptual idealism adds that the form of this contribution is such as to make 
essential reference to mind itself. 

The leading influences behind this line of thought Rescher says are Kant, the 
later Hegelians, and Peirce. His departures from Kant are perhaps more striking 
than his adherences, for the a priori element identified in our conceptual scheme 
(the mental reference) is so only in the sense of being given by us. There is no 
attempt at a transcendental argument to show the inevitability of such references 
(pp. 139-41, 171-4); indeed the debt to Peirce comes from the recognition of a 
pragmatic justification of our conceptual framework. His debt to Anglo- 
American neo-Hegelianism lies in his coherentist criterion of ‘truth about 
reality’ (pp. 166-71): objective truth cannot be seen as correspondence with 
an sich reality (something altogether extra-mental) but “is determined wholly 
on the phenomenal side”. (Rescher rests his case for such a criterion on his
full-blown idealism, yet this would rather seem a consequence more directly 
of rationalism as he defines it). Finally Rescher’s indebtedness to the central 
trend in philosophy is shown by his concentration on our conceptual scheme and 
the correct account of its fundamental conceptual machinery. His view is (pp. 
6-9) that such a concern is warranted because philosophical problems arise in 
and relate to that standard conceptual framework, a characterisation which 
surely gainsays any revisionary metaphysics (including Bradleian idealism). 

For Rescher a concept is mind-involving if “its full and adequate explication— 
not just on the side of its semantical meaning-content, but also on that of its 
applicability-conditions—cannot be carried out without some reference to those 
functions which ... are characteristic capabilities of minds” (p. 16). The 
official plan is to argue from the mind-invokingness of hypothetical possibilities 
to that of lawfulness, and from this, to that of particularity, space, time, and 
the properties of real particulars (p. 26). This plan is not faithfully adhered to,
for as the work develops Rescher is faced more and more with an embarrassment
of possible arguments to carry his case and often relies on more than one route 
to his conclusions. His treatment of possibility is nevertheless central to the task. 

Rescher distinguishes among types of possibilities the unrealised possible 
states of actual things, the dispositional possibilities of actual things, and the 
mere hypothetical possibilities, i.e. states of non-existent things. The fundamental
claim is that the ontological status of mere possibilities is as objects of
thought (p. 30): unrealised possibility is mind-dependent in a strong ontological
sense. This dependence is not on actually being conceived but on being con- 
ceivable, and not by any particular mind (pp. 36-7). Indeed such possibilities 
exist so long as the descriptive mechanisms of language allow their expression 
(p. 42). Rescher does not consider whether actual descriptive mechanisms
suffice to pick out all possibilities, for finer descriptive distinctions than those
actually made might be thought to pick out possible states also. Turning to the
other sorts of possibility, Rescher argues (pp. 47-51) for the thesis that these
are mind-involving in the standard sense. Not only does he fail to explain why 
possibility in general is not ontologically mind-dependent, but the argument 
for the weaker thesis is obscure. It rests on the premiss that unrealised states 
of affairs cannot be ostensively indicated, but must be introduced by way of 
assumption or hypothesis (which are mental activities). A more basic premiss 
is that “reality is invariably categorical and actualistic, never iffy and hypothetical”
(p. 49), a thesis which demands greater clarification and defence than
it gets.

The mind-involvingness of lawfulness is taken to be a consequence of two 
features of law statements, their nomic necessity (exhibited in counterfactual 
applications) and hypothetical force, and so a consequence of that of possibility. 
Rescher draws the implication (p. 71) that laws are in significant respects not 
discovered but made, and this introduces another dimension of mind-invoking- 
ness different from (though a consequence of) the possibilistic aspect. He develops 
in some detail an ‘imputational’ theory of lawfulness. In virtue of the features 
mentioned, evidence will always be inadequate to establish law statements 
(pp. 81-2). The lawfulness of a statement must therefore be the product of an 
imputation, whereby we admit its application in certain kinds of modal and 
hypothetical contexts (p. 83). Such an imputation is nevertheless warranted 
by certain well-recognised features of the evidential and systematic epistemic 
situation (p. 86 ff). Rescher wishes to develop this imputational account as a 
methodological device of wider applicability: however, in its application to the 
case at hand the question arises as to whether this theory can give acceptable 
accounts of the discovery or mistaken attribution of lawfulness. 

In briefer outline, the mind-involvingness of particularity is said to follow 
primarily from the connection between particulars and their criteria of identity 
which enable their identification and discrimination (p. 99). Rescher indicates 
a number of other ways of reaching his conclusion, including an argument 
from the connection between particularity and lawfulness (p. 107). Time and
space alike are claimed to be mind-involving in the strong sense of ontological 
mind-dependence, at least in some of their features. The measurability of space 
and time implies (weaker) mind-involvingness via lawfulness and particularity 
(though whether lawfulness as distinct from mere uniformity of events is 
demanded is perhaps questionable) but presentness and orientation are taken
to establish mind-dependence. The mind-involvingness of the properties of 
real particulars is said to follow from their essential dispositionality (p. 143) and 
in the case of observationally determinable properties also from the fact that
their analysis requires a reference to the experiencing mind (p. 144).

Rescher includes a searching chapter on the implication of his conceptual
idealism for the idea of mind-independent reality. Reality an sich is not to be
thought of in terms of the fundamental features of our conceptual scheme seen 
to be mind-invoking, and the terra incognita reached by eliminating those 
features “is not a very profitable realm of exploration” (p. 152). On the other 
hand, our conception of the world as experienced is that of the interaction 
between ourselves and the ground of our experience (p. 158 ff) and so some 
room must be available within our theory of nature for such an extra-experiential 
ground. Rescher makes this room by sketching that ground in admittedly mind-
involving terms, where it is seen as that basis of experience described in the theories
of science, a realm of inferred entities which act on our sense-receptors to produce
the familiar world (pp. 161-3). Rescher is quite correct in distinguishing this
conception of extra-experiential reality from that of reality an sich, and not
only is it not inconsistent with our actual conceptualisation of reality, it is indeed
part of it. However this by no means shows that one can ignore the other con-
ception of reality an sich, for that is as much an essential part of Rescher’s
idealism as the Ding an sich is of Kant’s, and (as Rescher is obviously aware)
there are grave difficulties in elucidating this notion which are consequent on 
the very terms of the theory. 

BRIAN CARR 
University of Exeter 


SPECTOR, M. [1972]: Methodological Foundations of Relativistic Mechanics.
University of Notre Dame Press.


According to the author the main purpose of his book is to clarify certain funda- 
mental concepts, principles and procedures in classical and relativistic mechanics. 
However, he also intended to give philosophers with a modest mathematical 
knowledge a “grounding” in relativistic mechanics; but the two objectives 
cannot easily be reconciled within one short book. While he has by and large 
succeeded with regard to the second, he has in my view achieved the former only 
in a rather cursory sort of way. 

The book is mainly concerned with the conceptual relations between relativ- 
istic mechanics on the one hand and classical mechanics and electrodynamics on 
the other. The physical principles of these theories are explained and illustrated; 
the concepts of force, mass and energy as well as the concept of a frame of 
reference, the Galilean and the Lorentz transformations are analysed in a neat 
and clear way without straining the reader’s mathematical ability. The book has 
its limitations, of course, which I mention not so much for the sake of criticising 
Mr Spector’s exposition, but in order to convey in a minimum of words what it 
can and what it cannot provide. Firstly, the concepts of space and time and the 
principles of spatio-temporal measurement, all of which one would assume to be
of paramount importance for an adequate account of the methodological founda-
tions of relativistic mechanics, are hardly gone into (“this requires a mathe-
matical background which I have not wished to assume of the reader”, the author 
explains). Secondly, only the mechanics of special relativity is dealt with; and 
in view of the fact that even an excursion into Minkowski space is considered
too risky, the reader cannot hope to be properly prepared for the point of view
of Einstein’s general theory of relativity. Thirdly, a number of philosophical 
problems are commented upon, e.g. reduction of electrodynamics to mechanics, 
the role of conventions in mechanical explanation, and questions of meaning 
shift (arising from the transition to the relativistic concept of energy); but none 
of these philosophical problems is considered in much detail. 

The author’s claim to have analysed “certain concepts and principles in a
manner which disagrees with standard interpretations” I find difficult to 
evaluate, partly because I am uncertain about what does and what does not 
qualify as a standard (especially a standard philosophical) interpretation. How- 
ever, with regard to what constitutes perhaps the rock bottom of such an exposi- 
tion, namely the interpretation and derivation of the Lorentz transformations, 
there appears to be little deviation from well-known patterns: the transformations 
are conceived of as connecting space-time coordinates obtained in different 
inertial frames and are derived from the conjunction of the light principle and the 
special principle of relativity. The author remarks that there is no incompatibility 
between the latter principle and the existence of an ether frame (i.e. a frame at
rest relative to the ether) as long as the laws describing this medium are them- 
selves Lorentz invariant. But this seems to be far from obvious. If a state of 
motion were attributed to the ether, and if relativistic effects such as rod con- 
traction, clock retardation and mass-speed relationship were given a causal 
explanation in terms of the relative motion of the ether and the systems con- 
cerned, then it would be difficult to maintain the Lorentz invariance of these laws 
of interaction between the systems and the ether. A clock at rest in an ether
frame F and a clock at rest in another inertial frame F’ which is in motion with 
regard to the ether would then be in different physical states (irrespective of 
whether this difference could in fact be discovered, needless to say). To put it 
in another way, Einstein’s convention for space-time measurement would in such 
a case be questionable. Thus although Einstein was prepared to admit a basic 
substratum he was unwilling to grant it a state of motion. The Lorentz trans-
formations are indeed not incompatible with the existence of an ether with a
definite state of motion, provided they are interpreted in a way which does 
not presuppose the special principle of relativity and Einstein’s convention 
for space-time measurement. 

Finally I should draw attention to some mistakes and inaccuracies in Mr 
Spector’s book. His treatment of the damped harmonic motion (pp. 22-3) 
contains a few of them. First, the solution given does not agree with the assumed 
initial condition x = 0 for t = 0. Secondly, the critical term for classifying the
types of motion is c² − 4mk and not k − c as seems to be suggested; e.g. c < k
does not imply c² < 4mk. Moreover, the condition for non-oscillatory motion is
not that a < 0 in cos at. Page 35: the analogy drawn between the “problem-
solving schema” in classical mechanics and in other areas of physics is fine, 
except that the term “state function” is highly misleading in that context, to 
say the least; it is not the state function but rather an operator equation which
could be interpreted as a quantum mechanical analogon to a force law in classical 
mechanics. Page 53: the identity e^{iz} = cos z + i sin z is not correctly stated.
Page 76: from the fact that the Lorentz force law contains the velocity v and v
is not invariant under a Galileo transformation it cannot be concluded that the
Lorentz force law is not invariant under such a transformation. Page 90: it is
not difficult to show that in the Michelson-Morley experiment the times T1
and T2 can be equal even if u ≠ 0 (u being the speed of the set-up with respect
to the supposed ether). Inaccurate shorthand references to the history of science 
have become the rule rather than the exception in a certain kind of book. A 
number of times (pp. 33, 40, 50) f = mg is referred to as “Galileo’s law of
gravity”; and the discovery of the conservation of momentum is generously 
attributed to Leibniz (p. 127). Relativistic mechanics is said to have been erected
by Einstein; but it would have been fair to mention also the name of Planck
seeing that it was he who introduced what the author takes to be “the relativistic 
force law” (p. 163). Page 90: it is of course not easy to say what precisely makes
a theory an ad hoc theory; however, in order to show that Lorentz’s ether theory 
was not ad hoc it is certainly not sufficient to point out that he was able to deduce 
the contraction effect from his “theory of matter”, as this deduction involved 
an assumption concerning the dimensions and deformations of electrons which
created difficulties and is generally considered “ad hoc”. 
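
The point about damped harmonic motion can be made explicit with a short worked calculation. What follows is only a sketch, assuming the standard textbook equation of motion with mass m, damping constant c and spring constant k; the symbol x for the displacement and the numerical values are illustrative choices, not taken from Mr Spector's book.

```latex
% Assumed (standard) equation of damped harmonic motion;
% x is a hypothetical name for the displacement.
m\ddot{x} + c\dot{x} + kx = 0
% The trial solution x = e^{rt} gives the characteristic equation
m r^{2} + c r + k = 0, \qquad r = \frac{-c \pm \sqrt{c^{2} - 4mk}}{2m}
% Classification by the sign of c^2 - 4mk:
%   c^2 - 4mk > 0 : two real roots, non-oscillatory (overdamped)
%   c^2 - 4mk = 0 : one repeated real root (critically damped)
%   c^2 - 4mk < 0 : complex roots, oscillatory (underdamped)
% Illustrative values m = 0.1, c = 3, k = 4 satisfy c < k, and yet
c^{2} - 4mk = 9 - 1.6 = 7.4 > 0
```

With these values the motion is non-oscillatory although c < k, which is why a comparison of c with k cannot serve to classify the types of motion.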

To summarise: if you are interested in the foundations of relativistic 
mechanics, and if you have done the calculus but forgotten all about it, this may 
well be your book. Even then you may like to put first things first and start with 
Einstein’s simple, profound and, in this respect, still unsurpassed classic. 

ALFONS GRIEDER 
The City University 


KREBS, H. A. and SHELLEY, J. H. (eds.) [1971]: The Creative Process in Science
and Medicine. Amsterdam: Excerpta Medica. New York: Elsevier. 


The idea behind this symposium was apparently to illuminate scientific creativity 
by bringing together some highly creative scientists for a discussion on the topic.
Because creativity is such a vague and ambiguous concept a delimitation of the 
problem is particularly important in this case. Presumably it was a belief in free 
discussion which kept the organisers from imposing a stricter framework. The 
result, however, is that major parts of the symposium have only a tenuous con- 
nection with the announced topic. 

The first session is philosophical, on ‘The Analysis of Scientific Method’, 
chaired by Karl Popper. A brief introduction by Jacques Monod strikes some
characteristic Popperian themes like the demarcation of science and the origin 
and role of theoretical problems in scientific research. In the discussion, however, 
the ethologist Desmond Morris soon introduces considerations of creativity in 
animals and children. And this turns out to be the most discussed topic of the 
symposium. 

Morris’s ethological line is continued by Nicholas Tinbergen’s introduction 
to the second session, ‘Patterns of Creativity in Animal and Human Behaviour’. 
Tinbergen stresses that one must start the study of scientific creativity by
investigating the biological basis of creativity as it appears in the general creative 
abilities of children and animals. He believes so firmly in a ‘deep-rooted, genetical 
basis of creative talent’ (p. 41) that he cannot see more than a ‘difference of
degree’ between the highly sophisticated activities of human art and science and
‘the primitive paintings that have been produced by chimpanzees’ (p. 40). 
Jacques Monod expresses doubts that this discussion of children and animals
touches on the essential problems of creativity in science (p. 51). Quite rightly 
in the eyes of this reviewer. But Tinbergen’s piece is appropriate enough to 
the announced topic of the session. 

The third session, ‘Creativity in the Biological Sciences’, is introduced by 
John Eccles with a popular lecture on current problems of neurophysiology. 
But he does very little to relate this material to the problem of scientific creativity.
Though John Eccles’s work contains excellent examples of scientific creativity,
a mere exhibition or description of creativity contributes little to our under- 
standing of it. 

In the fourth and last session, ‘The Dynamics of Creativity’, the psycho- 
therapist Charles A. Storr discusses the importance of the personality structure 
of scientists. But again it is doubtful, for instance, whether the psychoanalytic 
explanation that he suggests for Newton’s behaviour, is specific enough to 
illuminate scientific creativity. 

This may well have been a fruitful symposium for the participants. There are
many interesting ideas and pertinent remarks. But as a whole the report is too 
incoherent and superficial in its treatment of the many diverse topics it touches 
on to be worth publishing as a book. One cannot help suspecting that a major 
reason for its publication has been the illustrious list of participants: among a 
total of twenty-four there are five Nobel Prize winners. 


NILS ROLL-HANSEN 
The Norwegian Research Council for Science and the Humanities, Oslo 
SIXTH INTERNATIONAL CONGRESS OF LOGIC


The sixth International Congress of Logic, Methodology and Philosophy of
Science will be held in Hannover, Federal Republic of Germany, from 22 to 29
August 1979, under the auspices of the International Union of History and
Philosophy of Science (Division of Logic, Methodology and Philosophy of
Science) and sponsored by the German Research Council (DFG) and the Land
Niedersachsen. The congress will include the following fourteen sections:


1. Proof theory and foundations of mathematics,
2. Model theory and its applications,
3. Recursion theory and theory of computation,
4. Axiomatic set theory,
5. Philosophical logic,
6. General methodology of science,
7. Foundations of probability and induction,
8. Foundations and philosophy of the physical sciences,
9. Foundations and philosophy of biology,
10. Foundations and philosophy of psychology,
11. Foundations and philosophy of the social sciences,
12. Foundations and philosophy of linguistics,
13. History of LMPS,
14. Fundamental principles of the ethics of science.

The congress technical sessions will consist of a number of invited addresses
and symposia, in addition to brief contributed papers. 

The first circular with information about registration fee, accommodation 
and deadline for the receipt of abstracts will be mailed by the beginning of 1978. 
It can be obtained from Sekretariat des Internationalen Kongresses für Logik,
Methodologie und Philosophie der Wissenschaften, Welfengarten 1, D-3000 
Hannover / BRD. 
SIXTH BIENNIAL MEETING OF THE PHILOSOPHY OF
SCIENCE ASSOCIATION 


The Philosophy of Science Association will hold its Sixth Biennial Meeting at
the Jack Tar Hotel, San Francisco, on 28-30 October 1978. The program will
include symposia and invited papers as well as sessions devoted to the presenta-
tion of contributed papers. Contributed papers will be pre-printed as the first
volume of PSA 1978. The symposia and invited papers will be printed later as
the second volume.

The submission of contributed papers is invited. Contributed papers on any
topic in the philosophy of science are welcome. They may be written from any 
philosophical standpoint. Maximum length is 3500 words and the closing date 
for submission is 1 March 1978. Two copies, each including a 100 word abstract,
should be prepared in double space typescript. Blind refereeing will be used so
that the author’s name and institution must be on a separate cover page. These
materials should be sent to the Chair of the program committee, Ian Hacking,
Department of Philosophy, Stanford, California 94305, U.S.A. PSA is primarily
interested in papers that the authors believe are ready for publication. However, 
papers involving work in progress which the author wishes to present at the 
Natural Kinds*


by D. H. MELLOR 
1 Some notable philosophers have recently used new arguments to
revive essentialism, and have prescribed their essences for a variety of
metaphysical fears and ailments. The essence of a self has been said to
guarantee its ancestry; mental essence has been promoted as a sure defence
against materialism; and diamonds have been warranted in all possible
worlds against being paste. I mistrust these prescriptions, especially the
claims made for their active ingredients: possible worlds and necessary
identity. However, I don’t mean here to resist all applications of these
notions, nor to dispute all forms of essentialism. I mean only to supply an
antidote to the natural kind essences widely advertised by Professors
Kripke ([1971], [1972]) and Putnam [1975]. 

Kripke and Putnam claim that natural kinds have essential properties; 
that is, properties which nothing can lack and still be of the kind. The 
kinds involved include the traditional natural kinds: elements and com-
pounds like gold and water, and biological and botanical species like tigers
and elm trees. Modern essences, however, come in a wider range. Essential
properties are claimed also, for example, for temperatures and lengths.1
These do not traditionally form natural kinds, but it will be convenient 
here to stretch the term to match the doctrine. 

Properties alleged to be essential typically involve the microstructure 
of things. Having atomic number 79 is said to be the essential property of
gold (Kripke [1972], p. 327), being H₂O the essential property of water
(Putnam [1975], p. 233). Genetic makeup similarly provides essential
properties for animals and plants,2 and mean molecular kinetic energy for



Received 3 November 1976

* Versions of this paper have been given to seminars at Columbia, Stanford, the Australian 
National, Cambridge and Oxford Universities; and in August 1975 to the Annual 
Conference of the Australasian Association of Philosophy. I am much indebted to all 
those who have commented on the paper on these and other occasions.

1 ‘It is going to be necessary that heat is the motion of molecules’ (Kripke [1971], p. on
The proposition ‘The standard meter rod (S) is 1 meter long’ Kripke ([1972], p. 275)
takes to be a contingent a priori truth: a priori because S sets the standard, contingent
because S might have been shorter, longer or non-existent. But if S in fact fixed the
extension of ‘1 metre’ as Putnam prescribes (see below), then whatever shared properties
in fact make distant objects the same length as S would be essential properties of being
1 metre long.


2 ‘. . . animals with the appearance of cats but reptilic internal structure . . . would not be
cats, but “fools cats” ’ (Kripke [1972], p. 321).
temperature. These essential properties of natural kinds are supplied by
the natural sciences: ‘In general, science attempts, by investigating basic 
structural traits, to find the nature, and thus the essence (in the philo- 
sophical sense), of the kinds’ (Kripke [1972], p. 330).

The necessity of essential properties is metaphysical, not epistemic. 
The claim is that things of a kind have its essential properties in all possible 
worlds, not that its essential properties are knowable a priori.1 In particular 
it is not supposed to be analytic to ascribe its essential properties to things 
of a kind. Kinetic theory gives the essence of temperature, not the meaning
of ‘temperature’. Essentialism needn’t therefore dispute Quinean critiques
of the analytic-synthetic distinction on the one hand, nor on the other
need it plague theoretical conflict with the problems of incommensurability
(Feyerabend [1962], Kuhn [1962]) or indeterminacy of translation (Quine
[1960]): 


Note that on the present view, scientific discoveries of species essence do not 
constitute a ‘change of meaning’ . . . We need not ever assume that the bio- 
logist’s denial that whales are fish shows his ‘concept of fishhood’ to be different 
from that of the layman; he simply corrects the layman, discovering that 
‘whales are mammals, not fish’ is a necessary truth. Neither ‘whales are mammals’ 
nor ‘whales are fish’ was supposed to be a priori or analytic in any case (Kripke
[1972], p. 330).


So proponents of rival theories are not doomed by essentialism to Kuhn’s 
[1970] and Feyerabend’s [1970] dialogue of the deaf. We need not, 
therefore, continue that prolonged dialogue here.


2 Essentialism about kinds has various sources. It derives partly from the 
plausibility of examples such as those given above. Some among the 
properties common to things of a kind undoubtedly matter more than 
others. In particular, some will be more central than others to a theory 
which explains the properties and relations of things of that kind. I con- 
sider in section 7 whether properties being important in that sense is 
in fact best explained by, and thereby lends support to, the claim that 
they are essential properties of the kind; and I conclude that it is not. 
Individual essences are a classic source of essentialism about kinds. 
Some kinds may provide criteria for the reidentification of things of that 
kind, such that no thing of the kind could survive change in the specified 
respect. It is arguable, for example, that a man could not survive the loss 
1 Kripke [1971], pp. 150-1; [1972], pp. 260-3. I have used Dummett’s term for the latter
notion, since I incline to accept his account, on which “epistemic necessity is a stronger
notion... a statement may be [metaphysically] but not epistemically necessary, but the
converse could not occur. Kripke, however, claims the properties of being a priori and
being necessary to be quite independent” (Dummett [1973] p. 121).
or drastic transformation of the body whose spatio-temporal continuity
settles questions of human reidentification. I don’t think that is true, but 
even if it were, it would follow neither that any men, nor that all men,
must have bodies of the kind specified. In the first place, John’s inability 
as a man to become a beetle is compatible with the possibility of his always 
having been one; in the second, there could still have been men (other 
than the men there actually are) who lacked these bodily features and 
would be reidentified over time by different criteria. Anyway, arguments 
from individual essence, if they worked at all, would not work for kinds, 
such as water and gold, that provide no criteria for the reidentification of
things. Perhaps there are characteristics a gold cup cannot come to have,
but that will not show that other gold objects cannot have them.

There are, I believe, no sound inferences from individual essences to 
kind essences: but that point is not new, and I need not argue it further 
here. My concern here is with two other, newly fashionable, arguments 
for essentialism about kinds. One, due to Putnam [1975], derives essen- 
tialism directly from a theory about how the extension of kind terms is 
fixed. The other, due to Kripke ([1971], [1972]), derives it indirectly via 
a theory of the singular reference of natural kind and other seemingly 
general terms, from whose necessary self-identity essentialism is taken to 
follow: 


When we have discovered that heat is molecular motion we’ve discovered an 
identification which gives us an essential property of this phenomenon. We 
have discovered a phenomenon which in all possible worlds will be molecular
motion — which could not have failed to be molecular motion, because that’s 
what the phenomenon is (Kripke [1972], p. 326). 


Putnam’s theory of the extension of kind terms, and Kripke’s theory of 
their reference, are alike in denying traditional accounts that make the 
reference (or extension) of terms a function inter alia of something like 
their Fregean sense.1 As applied to kinds in particular, the new theories 
deny that the extension of kind terms is any function of descriptions 
believed by their users to be true of things of the kind (Putnam [1975], 
p. 221). Fregeans, who believe the contrary, need not of course deny that 
there are non-analytic essences of kinds (Dummett [1973], p. 117), but 
Fregean theories of how kind terms get their extension give no especial
reason to think there are any. Fregean theories in general, and description 
theories of kind terms in particular, yield necessity only as a by-product 


1 I say ‘inter alia’ because Fregean reference or extension is obviously a function also of
context (e.g. in indirect speech, according to Frege, a name refers to what is normally
its sense) and of what the world contains (e.g. whether ‘gold’ applies to my tiepin depends
on whether my tiepin is gold). Cf. Dummett [1973], ch. 5, 9.
of analyticity. (Had our Fregean sense of ‘water’ made us apply it only to
what we believe ‘H₂O’ applies to, then, at least for us in this world,
water would in all possible worlds have been H₂O. But that, as we have
seen, is not why essentialists think being H₂O is of the essence of water.)

To provide essence without analyticity, an alternative is needed to the 
Fregean sense of what water is. Putnam to that end tells two tales designed 
to bury Frege, as a prelude to recommending his own essentialist account 
of natural kinds. In fact, Putnam buries Frege alive and well; and that 
fact must be shown first, before we can profitably turn to the deficiencies 
of Putnam’s and Kripke’s rival theories. (My arguments against Putnam’s 
interpretation of his two tales overlap with those of Zemach [1976], but
my further purposes make it desirable to restate them here in my own
way.)


3 Putnam’s tales are aimed at the idea of a kind term’s extension being 
any Fregean function of its users’ beliefs. The tales therefore present 
cases where such an extension differs for two groups of users with 
relevantly identical beliefs. First, we suppose a Twin Earth somewhere, 
which is just like Earth except for a different microstructure, XYZ, of
what they too call ‘water’. Macroscopically XYZ is indistinguishable from
H₂O, and it plays just the same part in Twin Earth life that H₂O does
here. By 1950, however, it has become common knowledge on each planet 
that the other lives off different stuff. But back in 1750 no one knew about 
the microstructure of water, and each planet had identical beliefs about 
the stuff they so called. Yet “the extension of the term ‘water’ was just 
as much H₂O on Earth in 1750 as in 1950; and the extension of the term
‘water’ was just as much XYZ on Twin Earth in 1750 as in 1950” (Putnam
[1975], p. 224). Now the local stuff was no doubt in the extension of ‘water’
as used on each planet in 1750. If the stuff on the other planet was different
in kind, it was presumably not in fact in the term’s extension, even though 
the users then would have mistakenly thought it was. Hence the anti- 
Fregean conclusion: ‘water’ can have different extensions in the same 
world for different users who give it the same Fregean sense. 

I agree that ‘water’ had (tenselessly) the same extension in 1750 as it 
had in 1950; what I deny is that at either time that extension was different 
on Earth and on Twin Earth. The fact that Twin Earth’s 1950 beliefs 
about local water differed from ours doesn’t begin to show that the exten- 
sion of their term ‘water’ differed from that of ours. It doesn’t even 
follow that the senses of the term differed; and if they did, the whole 
point of the sense/reference distinction is to allow sameness of reference
(or extension) to accompany difference of sense. It is indeed quite plain to 
my Fregean eye that in 1950, as in 1750, ‘water’ had the same extension
on Twin Earth as it had here. There was water on both planets alike, 
and there still is. We simply discovered that not all water has the same 
microstructure; why should it? Because its microstructure is an essential 
property of water? Well, that is what’s in question.

Fregeans need not resort to science fiction to recommend their reading 
of this tale. There is a perfect precedent in the discovery of isotopes. If 
Zemach’s ([1976], p. 120) heavy waters are too rare and exotic to convince,
try the two common isotopes of chlorine. Note that in these real cases 
the various isotopes occur together in natural samples; they aren’t segre- 
gated onto separate planets. It is therefore undeniable that the extension 
of ‘chlorine’ included both isotopes before their discovery, and so presum-
ably includes both isotopes now.1 What Putnam must say is that chlorine
and water have been found not to be natural kinds after all, but rather
mixtures of natural kinds. But in that case, as Zemach ([1976], p. 122)
observes, it will very likely turn out that we have no natural kind terms. 
Anyway, the pertinent point is that the first Twin Earth tale doesn’t 
compel that conclusion, which it would have to do to dispose of Frege. 
The Fregean reading that Putnam overlooks is prima facie at least as
plausible as his own. 

Putnam, however, has another Twin Earth tale that won’t admit this 
Fregean reading. This time we suppose that aluminium and molybdenum 
are practically indistinguishable, that molybdenum is as common on Twin 
Earth as aluminium is here, and consequently that on Twin Earth all our 
Earthly uses of the two metals are interchanged. Moreover, like Americans, 
Twin Earth men don’t call aluminium ‘aluminium’; they (unlike Ameri- 
cans) call it ‘molybdenum’. The term ‘aluminium’ they reserve for 
molybdenum. Now most people, both here and there, can’t tell the metals 
apart; call these people ‘laymen’. Laymen here are therefore in the same 
psychological state about aluminium that laymen there are about what 
they also call ‘aluminium’. Yet the extension of ‘aluminium’ as laymen 
use the term there is indubitably molybdenum and thus quite different 
from the extension of ‘aluminium’ as laymen use the term here. 

My Fregean waterworks, chlorinated or not, will not wash with this 
tale. Our laymen know that aluminium isn’t molybdenum, even if they 
can’t tell what the difference is. So we can’t make our term ‘aluminium’ 
apply also to Twin Earth molybdenum as we made ‘water’ apply also 
1 The sense of ‘chlorine’ might of course have changed in the wake of this discovery, to
make the term apply only to the more common isotope; just as the sense of ‘water’ might
have changed to exclude XYZ or D₂O. But that is not what happened, and anyway not
what Putnam needs. If changes of belief about the microstructure of kinds do produce
changes of extension in kind terms, that rather recommends a Fregean view.
to XYZ. The reason of course is that there are high priests as well as lay-
men, experts who can tell the difference. Our experts fix the extensions of 
‘aluminium’ and ‘molybdenum’ for laymen here; experts on Twin Earth 
fix them the other way round for laymen there. There is, as Putnam 
([1975], p. 227) puts it, a “division of linguistic labor”.

Very well. It need not be my beliefs that fix the reference or extension 
of terms which I can use quite well in my limited way. So I defer to experts, 
whose job it is to say what such a term really applies to. The reference or 
extension in any possible world of the term as we use it may nevertheless 
still be some Fregean function of our experts’ beliefs. In Australia, for 
example, to be “back of Bourke” is to be way out in the outback. That
Bourke is at an edge of Australian civilisation is all I know about that
place, certainly not enough to enable me to tell Bourke from several other
places. Yet I can still refer to Bourke, as I have just done, by taking it to 
be whatever place would best fit more expert geographers’ beliefs about it. 
Our knowledge of most things and places must be like this; certainly 
most of our knowledge of natural kinds. So no doubt the labour of reference 
is divided, as Putnam says, but it may be a Fregean labour for all that
(Dummett [1973], pp. 138-9; [1974], pp. 530-1).


4 Fregeans can cope with Putnam’s Twin Earth tales. Frege is still in 
the ring; so how does Putnam fare on points? His rival theory gives the 
extension of natural kind terms in two stages. First, archetypes in this 
world, paradigm specimens of the kind. Then anything, in any possible 
world, that has a suitable “same-kind” relation to the archetypes. It is for
science to tell us what the same-kind relation is for any category of kinds
(for H₂O, Putnam proposes, implausibly enough, a same-liquid relation).
Generally, Putnam assumes that the relation will specify a shared micro- 
structure. But whatever shared properties Putnam’s same-kind relation 
picks out will be essential properties of the kind, since the relation is 
assumed to be an equivalence relation that holds across all possible worlds 
(Putnam [1975], p. 232), not just in this one. So not just actual specimens 
of the kind share the specified properties: nothing could be of the kind 
and not share them with the kind’s archetypes. 

Putnam’s necessity is metaphysical, not epistemic. A natural kind may 
be known long before its essential microstructure is known. Putnam’s 
theory is radically anti-Fregean: given archetypes, a kind term’s extension 
is fixed by its same-kind relation, regardless of what anyone believes. 

Putnam’s argument for kind essences credits kinds both with archetypes 
and with cross-world equivalence relations holding between all things of 
the kind. Real natural kinds need have neither. Take archetypes. Their 
role is to fix the kind term’s extension without recourse to its Fregean sense.
Putnam takes it that they must therefore be in this world, so that they can 
be picked out ostensively. Thus pointing out Lake Michigan as an arche- 
typal sample of water (Putnam’s example!) fixes the extension of ‘water’ 
regardless of its sense (just as naming the lake ‘Lake Michigan’ is supposed
to fix the reference of that expression regardless of its sense). Then the
rest of the extension of ‘water’, in any possible world, is just what stands
in the relevant same-kind relation to this archetype. That is why Putnam
([1975], p. 234) says that kind terms are ‘indexical’, like ‘I’, ‘now’, ‘here’
and ‘this’.

Ostensive reference, to just this archetype in this world, is thus essential 
to the mechanism of Putnam’s essentialist theory. He must show, there- 
fore, that our use of kind terms actually incorporates it. An extension 
of Kripke’s causal theory of naming offers to show this. The theory has 
some irrelevantly contentious aspects, about how users of names pass 
them on;1 all we need here is that, roughly, some archetype must be
causally “upwind” of any use of a natural kind term. Thus our uses of
‘water’ and ‘aluminium’ are supposed to derive causally from our (or our
experts’) causal acquaintance with archetypal specimens of H₂O and
aluminium respectively; and that is supposed to be why H₂O and alumin-
ium are what we refer to by those terms. The corresponding Twin Earth
uses derived causally from archetypes of XYZ and molybdenum; which 
is why those are the kinds they refer to by the same terms. 

That is the theory. Unfortunately, archetypes do not constrain our use 
of natural kind terms in that way. True, botanists designate type specimens 
of plant species, and geneticists designate cultures to exemplify gene-types 
(Jardine [1975]). But these specimens are causally downwind of the usage 
they are supposed to constrain. They are chosen to fit botanical and genetic 
knowledge, not the other way round. They are certainly not the specimens 
whose classification caused the corresponding kind terms to be used in the 
first place; they may well indeed not even be of the same kind as those 
specimens. Hence, as Jardine [1975] and Zemach ([1976], pp. 123-4)
have observed, our most authoritative specimens of a kind might on Put-
nam’s account not even be of that kind.

1 ‘When the name is “passed from link to link”, the receiver of the name must, I think,
intend when he learns it to use it with the same reference as the man from whom he
heard it’ (Kripke [1972], p. 302). Intention isn’t enough, however: ‘I am asked to name
a capital city and I say ‘Kingston is the capital of Jamaica’; . . . [I] said something strictly
and literally true even though it turns out that the man from whom I picked up this
scrap of information was actually referring to Kingston-upon-Thames and making a
racist observation’ (G. Evans [1973], p. 194). ‘We are left with this: that a name
refers to an object if there exists a chain of communication, stretching back to the intro-
duction of the name as standing for that object, at each stage of which there was a
successful intention to preserve its reference. This proposition is indisputably true; but
hardly illuminating’ (Dummett [1973], p. 151). See also J. E. J. Altham [1973], pp.
209-25.

So some archetypal natural kinds have the wrong archetypes; others 
have none at all. Consider elements high in the periodic table, that do not 
occur in nature and have never been made. We have names for them, but
there may never be archetypes to constrain our use of the names. Even 
if specimens eventually appear, the discovery, creation or synthesis of 
previously unknown fundamental particles, elements and compounds can 
surely be predicted. The term ‘neutrino’ applied to just the same particles 
when it was used to predict their existence as it has applied to since their 
discovery. Ostensive reference (say to a bubble chamber photograph) 
could not have fixed its extension then; why suppose exactly the same 
extension is fixed that way now?

Even if we were to grant Putnam his archetypes, however, his essential- 
ism would still fail to follow. No reason is given why particular properties 
must be common to all things in all possible worlds that are of the same 
kind as the archetypes. Suppose that all samples of water in fact share 
ten “important” properties, but that water could lack any one of them, 
so that only the disjunction of all conjunctions of nine of them is essential. 
In this case of course sameness of kind is not the equivalence relation 
Putnam ([1975], p. 231) says it is, since it is not transitive; two merely 
possible samples of water could differ in two of the ten properties. But 
Putnam’s account doesn’t in fact provide transitivity, since what makes 
things water in other possible worlds is their likeness to archetypes in 
this world, not their likeness to each other. To claim that the relation is
an equivalence relation, so that archetypes have to share the same
properties with all possible samples of the kind, is just gratuitously to
assume the essentialist conclusion. 


5 Putnam’s account of the extension of kind terms both fails to be true 
and fails to entail essentialism. I turn now to Kripke’s derivation of essen- 
tialism from the necessary self-identity of natural kinds. In evaluating his 
argument it is essential to keep the reference of a kind term clearly dis- 
tinguished from its extension; to which end I hereafter distinguish the 
supposedly singular term ‘water’ from the corresponding predicate 
‘. . . is water’, and likewise for other kind terms. Now Kripke ([1972],
p. 349) admits that a causal mechanism is not needed to secure the reference
of natural kind terms. ‘Neutrino’ could be introduced, as it was, by
theoretical description and still be applied “rigidly” in Kripke’s sense.
That is, we can still consider the consequences of just that kind of particle 
failing in this or that respect to satisfy the theoretical descriptions that in 
fact served to identify it. ‘Neutrino’, so understood, is for Kripke a non-
Fregean rigid designator, since its reference in other possible worlds is 
not constrained to satisfy these theoretical descriptions, which must be 
supposed to provide its Fregean sense. For this to work, of course, there 
must actually be neutrinos, near enough as specified. They needn’t be in
the observed past, to serve as archetypes; but they do need to be somewhere 
in this world, past, present or future. Otherwise the requisite uniqueness 
of reference would not be secured. Many different kinds of particle will 
satisfy our theoretical descriptions of neutrinos in various possible worlds;
and nothing but the reality of one of these will single it out as the unique
referent of the term ‘neutrino’. Were there in fact no neutrinos, the term
could for Kripke no more designate a natural kind than ‘unicorn’ can
(Kripke [1972], p. 763).


However, there are neutrinos, just as there is H₂O. Let us therefore
grant for the moment that ‘H₂O’, like ‘neutrino’, can be made a non-
Fregean rigid designator by theoretical description and Kripkean fiat.
Grant also that ‘water’ is such a designator, in this case perhaps even
because our use of the term derives causally from Putnamesque archetypes
like Lake Michigan. What more is needed for ‘water’ and ‘H₂O’ to desig-
nate the same kind? And how is sameness of kind related to coextensiveness
in the corresponding predicates ‘. . . is water’ and ‘. . . is H₂O’, which is
what matters for essentialism?

We must in fact tackle the latter question first in order to answer the 
former. As referents of singular terms, kinds are obscure entities, not 
notably clearer than properties or attributes in their criteria of identity. 
If the necessary self-identity of kinds is to have any implications for 
essentialism, these criteria will have to be spelled out in terms of predicate 
extensions. One such criterion presents itself at once. For water to be the 
same natural kind as H₂O, it seems at least to be necessary for ‘. . . is
water’ and ‘. . . is H₂O’ to be coextensive in all possible worlds; otherwise
some possible world would contain something whose membership of the 
kind depended on what the kind was called, which seems implausible. 

But then it looks as if ‘Water is H₂O’, construed as an identity statement
linking rigid designators, already and trivially entails that all samples of
water in all possible worlds are also samples of H₂O. Far from the necessity
of this identity establishing that being H₂O is an essential property of
water, that is just what must be the case for the identity claim to be true
at all. It is indeed not clear what more the metaphysical necessity of
‘Water is H₂O’ could consist in, since we have already used up our possible
worlds in saying what makes it true. No doubt it is necessary if true; but
only because if in this world ‘. . . is water’ and ‘. . . is H₂O’ are coextensive-
in-all-worlds, so they are in all other worlds.1 Because the identity criteria
of kinds have thus to be given in terms of their extensions, there is no 
useful inference to essentialism from the necessity of identity, even if the 
kind terms involved are admitted to be rigid designators; if anything, the 
inference will be the other way. 

One might try arguing, however, that the requirements of natural 
kind identity have been pitched too high. Perhaps what is needed is 
coextensiveness, not in all possible worlds, but in this world and in those 
nearest to it. After all, science is supposed to give us essences; yet the most 
scientists can show us in fact is lawful coextensiveness between ‘. . . is 
water’ and ‘. . . is H₂O’. That is, we suppose they can show that not only
are all samples of water samples of H₂O, but that if anything were a sample
of water it would be a sample of H₂O. Now that need not be a claim about
all possible worlds, since the consequent of a true subjunctive conditional 
need not be true in all possible worlds in which the antecedent is true; 
it need only be true in those worlds which are sufficiently like ours (Lewis 
[1973]). Waiving the difficulties of characterising lawfulness in terms of 
truth in nearby worlds without begging the question, suppose for the sake 
of argument that the lawfulness of ‘All and only samples of water are
samples of H₂O’, so characterised, suffices for the identity of water and
H₂O. Will the rigidity of ‘water’ and ‘H₂O’ now secure necessity for this
identity and thus coextensiveness in all worlds, however remote, for
‘. . . is water’ and ‘. . . is H₂O’?

To see that it will not do the latter, we must appreciate how loose the 
connection is between the reference of ‘water’ (say) and the extension of 
‘. . . is water’ in various possible worlds. Obviously there could be much
more water than there is, or much less. That is to say, there are possible 
worlds with samples of water that don’t exist in this world; and others in 
which some real watery individuals (like Lake Michigan) don’t exist at all. 
Consequently, in yet other possible worlds, there is water all right, al- 
though no individual sample of it is identical with any of the real samples 
that we have. In each of these worlds the singular term ‘water’ has the same 
reference (namely, of course, water—whatever that is), while the extension 
of the predicate ‘. . . is water’ may differ totally from what it is here or in 
other possible worlds. 

Now we are at present supposing, for the sake of Kripke’s argument, 
that coextensiveness between ‘. . . is water’ and ‘. . . is H₂O’ in all worlds
is not required for the identity of water and H₂O, only coextensiveness in
this and some suitable class of nearby worlds. We now see that in other
worlds the extension of ‘. . . is water’ may be quite different from its
extensions hereabouts; and so may that of ‘. . . is H₂O’. And whatever the
necessity of identity may do to secure that ‘water’ and ‘H₂O’ have the same
reference in these other worlds, it will do nothing to secure that ‘. . . is
water’ and ‘. . . is H₂O’ have the same extensions, which is what essential-
ism needs.

1 I ignore as incredible accounts of metaphysical (as opposed to epistemic) necessity in
which this does not follow, i.e. in which the accessibility relation between possible worlds
is not transitive (Hughes and Cresswell [1968]); but see below.


6 Even if ‘water’, ‘H₂O’ and similar kind terms were rigid designators,
therefore, essentialism would get no help from the supposed necessity of
identities like ‘Water is H₂O’, however the reference of the singular terms
involved is related to the extension of the corresponding predicates.


But there is anyway no good reason to admit that these terms are non- 
Fregean rigid designators. Kripke implies ([1971], pp. 146-8) that without
rigid designators we would need Lewis’ [1973] “counterparts” in order to 
state counterfactuals about things (and kinds). I share Kripke’s distaste 
for counterparts, but this is a false dichotomy. Fregean names may 
designate the same things or kinds in many possible worlds, namely in all 
those that can be specified by Fregeanly intelligible counterfactuals about 
them. On a “cluster” version of the description theory (e.g. Searle [1958]), 
taking account of Putnam’s division of linguistic labour, such counter- 
factuals can suppose the lack of almost any property of the thing, or of 
things of the kind, and of all such properties attributed by any one non- 
expert speaker. That is quite enough to give a specious appearance of non- 
Fregean rigidity to what may in fact be Fregean names. Of course it is 
trivially true that ‘water’ applies to water whatever intelligible counter- 
factual supposition we make about it. The question is whether something 
like a Fregean sense of ‘water’ limits the range of such suppositions. I 
see no reason to deny that it does, and equally none to assert that anything
about water itself makes the antecedents of any such counterfactuals 
necessarily false. 


7 We have seen that essentialism can be extracted neither from Putnam’s 
archetypes, nor from a merely stipulated rigidity of reference via modish 
truisms about identity. The existence of essences in Kripke’s and Putnam’s 
theories is no more than a gratuitous assumption on their part. Its appeal 
lies solely in that of the stock exemplars of essential properties; and that 
appeal, I shall argue in conclusion, is specious. 

In biological species, for example, there is a distinct dearth of suitable 
properties shared even in this world. Capacity to interbreed with fertile 
offspring is the obvious candidate, but even that is well known to lack 
the transitivity Putnam’s same-kind relation would need in order to yield
essentialism. It usually doesn’t hold between all and only members of the
same species here, never mind in other possible worlds. Nor is coming of
a common stock plausibly essential to a species, which could easily have
been cross-bred from independent mutations (pace Dummett [1973],
pp. 143-5). (I fear this suggestion derives partly from the idea (Kripke 
[1972], pp. 312-4) that an offspring’s parents are essential to it; of which 
idea it perhaps suffices to remark that ‘If John hadn’t been a Kennedy he 
wouldn’t have been shot’ is a plainly intelligible contingent statement
(cf. Dummett [1973], p. 132).)

Elements and chemical compounds offer more scope for essentialism,
since all their specimens at least share some properties in this world. Of
properties supposed to be shared in all possible worlds, we have seen that
microstructure provides the stock exemplars; it is worth asking why. 
Scientists commonly employ a principle of “microreduction”,1 i.e. roughly
the principle that properties of things should be explained in terms of the 
properties and relations of their spatial parts. Many properties can indeed 
be so explained; and the assumption that they can underlies standard 
techniques for studying things in convenient (e.g. laboratory) isolation 
from their normal surroundings. 

Microreductive theories are thus relatively easily testable, which is a
well-known Popperian virtue in science. It is therefore both good method 
to look for microreductive explanation of kinds of things and an im- 
portantly pervasive fact that they can be found. Where, moreover, such 
explanation is both comprehensive and deductive, we may be able to 
replace reference to things of a kind with reference to their parts. So it is, 
roughly, that reference to gas samples was made eliminable, by the classical 
kinetic theory, in favour of reference to gas particles. 

Suppose we now adopt a Quinean view of ontology, and admit only 
kinds of things that need to be referred to in stating what we know. Since 
kinetic theory makes reference to gas samples redundant, they disappear 
from our ontology. Similarly, let us suppose, with water and gold. Micro- 
reductive theories make reference to anything more than H₂O molecules
and gold atoms redundant. There is, we might say, nothing to water and
gold but the particles of which they are composed. And this, I reckon, is
the source of the idea that water would have to be H₂O in any possible
world, because H₂O is all there is to water.

The inference is specious. This way of removing items from our ontology 
requires deducibility. If all the macroscopic properties of a kind are not


1 Schlesinger [1963], ch. 2. The next four paragraphs condense an argument given more
fully in my [1973], pp. 110-12. 
logy of ileal programmes in which reconstructions would be both
rational and historically true.

In the first part of this paper I shall try to argue, against Elkana, that
the internal-external distinction is useful for any historiography which
is to increase our understanding of science. I shall try to show that his
criticism is misconceived and that the failure to distinguish ‘internal’ from
‘external’ factors leads to epistemological confusions. I shall also criticise
some of his historiographic suggestions and thereby his modified version
of the methodology of research programmes. In the second part, I shall 
examine critically some ideas of Lakatos connected with his evaluation 
of rival methodologies. It will also be noted that in his attempt to
construct a general method for the historical test of methodologies, the
(originally) neutral internal-external dichotomy acquires normative impli-
cations. I shall reject this meaning-shift together with Lakatos’s formulation
of the historical test. I shall also express some reservations concerning his 
criteria for progress. 


Let me start by examining Elkana’s argument against the ‘internal- 
external distinction’. I shall quote a few passages in order to show that his 
criticism is mistaken: 

The central problem remains to define criteria for progressive and degenera- 
tive problem shift: for growth and stagnation. Lakatos claims that a research 
programme is progressing as long as its theoretical growth anticipates its empiri- 
cal growth. That is as long as it keeps predicting novel facts with success. It 
stagnates when it only gives post hoc explanations of ‘chance’ discoveries. 

It is in the application of these criteria that the internal-external dichotomy
is at its weakest, for in order to accept its criteria, one has to assume that stages
exist when pure experimental work really precedes theoretical work. We even
have to accept the existence of chance discoveries.1


Elkana goes on to deny that ‘empirical growth ever anticipates theoretical
growth in the objective sense or that we are faced with real chance dis-
coveries which have to be explained’. He summarises his argument (on
p. 248) as follows:

My point is that these criteria (for distinguishing progressive from degener-
ating problem-shift) are not necessary if one abandons the internal-external
dichotomy.

Before I turn to the substance of this criticism let me point out that the
argument is misdirected. Lakatos’s theory of the growth of knowledge in 
which the criterion for progressive problem-shift plays a central role is not 
dependent on his (later) internal-external historiographic distinction. 


1 Elkana [1974], pp. 247-8. 
The criteria which Elkana attacks are clearly criteria for progress and they
have nothing to do with the internal-external dichotomy. Consequently,
it does not follow that criteria of progress become superfluous if the
internal-external distinction is abandoned. Some criteria of progress are
indispensable for any normative philosophy of science; for the methodology
of scientific research programmes which aspires to provide a theory for 
the appraisal of competing theoretical systems, they clearly are essential. 
Elkana’s attempt to modify Lakatos’s methodology by dropping its criteria 
for progressive and degenerative problem-shifts misses the main point 
of the whole enterprise. Elkana’s position is quite strange, for his criticism 
(if valid) would undermine the whole methodology of research programmes.
Yet he appears not to take his own criticism seriously, as he says elsewhere: 


Naturally, one must deal also with the question of what progress is. In the
framework of the theory of growth of knowledge, there is a heavy reliance on
Lakatos’s theory of scientific research programmes. The concept is clear: ‘pro-
gressive problem shift’.2

Let us proceed to the substance of Elkana’s argument, to his criticism of 
‘chance’ discoveries and the priority of empirical over theoretical growth.
As to chance discoveries, Elkana simply denies their existence: 


Let me remark that Lakatos himself distinguishes between chance discoveries 
in the ‘objective’ and the ‘subjective’ sense. In my opinion there is certainly no 
such thing as a chance discovery in the ‘objective’ sense—each experiment is 
either a confirming or refuting instance of some theory.3


It is, of course, true that each experiment may, in a sense, be a confirming or
refuting instance of some theory, for it is always possible to connect an
experimental result with a number of theories that either already exist or
may be formulated ex post. Lakatos’s concept of ‘chance discoveries’
does not preclude this possibility. Let me quote him on this point:


An experimental discovery is a chance discovery in the objective sense if it is 
neither a confirming nor a refuting instance of some theory in the objective 
body of knowledge of the time.4 


This simply means that as a result of experiment we may come across a 
phenomenon that cannot be retrodicted on the basis of any scientific 
theory known at the time of the experiment (and neither could its negation). 


1 One may, of course, criticise Lakatos’s criteria or one may argue that it is impossible to 
set up universally applicable standards for the evaluation of competing theoretical 
systems (as, for example, Feyerabend does). However, to claim that the criteria of this 
type are redundant is to misunderstand one of the most important tasks of the philosophy 
of science. (For a lucid exposition of the importance of this problem see Lakatos [1974a].) 

2 Elkana [1972], section 4, p. 2.

3 Elkana [1974], p. 248.

4 Lakatos [1971], p. 204, n. 24.
A scientist may, for example, want to test a prediction concerning the position
of some celestial body. Focusing the telescope on the relevant region he
may notice that there is yet another body that has not been observed
before, perhaps because the telescope had never been focused on this
region. I cannot see any reason why claiming this to be a case of chance
discovery should be considered as conflation with classical inductivism, 
as Elkana suggests. Experiments are, of course, set up for purposes of 
confirmation or refutation of some theory, but to claim that every experi- 
mental outcome or every observation is of this nature would amount to 
saying that there is no possible state of affairs which has not been already 
accounted for (or specifically prohibited) by some theory known at the
time of its performance. I don’t think Elkana would subscribe to such a
claim.

Lakatos’s theory of rational reconstruction and his distinction between 
internal and external history is based, as he himself points out,1 on Popper’s
conception of the ‘third world’. This distinction calls for recognition of the 
fundamental difference between the logical and the psychological aspect 
of thoughts; between thought-contents and thought-processes. It enables 
Lakatos to account for the fact that logical relations between theories and 
problem-situations are not always perfectly reflected by the individual 
or the collective awareness of scientists. There is nothing unnatural about 
his insistence on the priority of internal explanations supplemented by 
external explanation of the discrepancy between the rational reconstruction 
and the actual narrative. I think he only explicates what historians often
do anyway.2 Elkana, as I have already mentioned, considers this distinction
superfluous and detrimental to historiography. He believes that it prevents 
an historian from seeing (or telling) the true story. In my opinion, there is 
no reason why it should. This danger seems rather more immanent in the 
opposite approach suggested by Elkana himself. In the context of arguing 
against the internal-external dichotomy he claims that


If our view on the growth of science is correct, that is if scientific growth is 
explainable by competing scientific research programmes then it is also true that 
scientists are always aware of competing research programmes, and when they 
are planning an experiment they always wish to confirm some of their predictions 
or to refute some of the rival’s predictions.3


1 Ibid., p. 205. 

2 Assume that the hypothesis H follows from the theoretical system T. If we want to 
explain why a scientist who accepts the system T also believes in H, historians usually 
don’t invoke external factors but simply point out the logical connection between T 
and H. When this (or some other) logical connection doesn’t correspond to the associated
beliefs then they offer further external explanation.

3 Elkana [1974], p. 248, his italics.
If this be taken seriously, Elkana has to deny that there were ever creative
scientists who were not aware of competing research programmes (in the
sense that the concept is employed by him and by Lakatos). This seems
quite difficult, for there certainly are scientists who contribute to the growth
of knowledge without being aware of their own research programme, let
alone the rival ones. For quite a few of them it may require some effort to
grasp Lakatos’s or Elkana’s explanation of what constitutes a research
programme and to structure their knowledge of their field into such a
form. This becomes even more apparent when we look back in history. 
Reconstructing history with hindsight, we may always find competing 
research programmes if we look for them; this, however, does not mean
that scientists and philosophers were actually aware of them. Elkana
further claims that when scientists are planning an experiment they
always do so with the intention to confirm their own predictions or to 
refute some of their rivals. The explicability of the growth of scientific 
knowledge in terms of research programmes is also made dependent on 
this. But again, this claim can hardly be sustained as a matter of fact, and 
as a matter of methodological principle it is not helpful.1 If we push this
principle a little further, we would have to say something like this: to
present a certain period in history as one of theoretical stagnation is 
correct if and only if the scientists concerned were actually aware of the 
degeneration of their research programme; or conversely: if they were 
convinced that they were achieving progress, then the only correct re- 
construction of that period should be that of scientific progress. 

The requirement that our explanatory theory should be in agreement 
with the actual thinking of scientists leads to two equally unacceptable
options: either one has to show that scientists always based their decisions 
on the same methodology that we use in the historical reconstruction, or 
else one has to alter the methodological and conceptual framework of our 
reconstruction each time that the ‘image of science’ changed. 

Another problem raised by these considerations is the question: ‘What 
are the relevant factors in explanation of the growth of scientific know- 
ledge?’ This is where the answers of the two philosophers differ most 
radically. Lakatos suggests that only ‘internal’ factors are relevant to the 
problem of growth. He claims that ‘in view of the autonomy of internal 


1 It is, of course, quite natural that scientists should wish to refute rival theories and con- 
firm their own. However, there are cases where the opposite is true. One does not have 
to invoke masochism to explain them. It may be perfectly rational for a scientist to 
wish for refutation of his own theory—he may have good reasons for this: e.g., his theory, 
if true, would have undesirable social consequences; it may go against his ideological or 
religious beliefs, or it may simply contradict some other scientific theories to which he 
is deeply committed. 


(but not external) history, external history is irrelevant for the under- 
standing of science’.1 As against this, Elkana maintains that part of those 
factors that Lakatos calls ‘external’ also have to be taken into consideration. 
He offers the following argument: 
I fully agree [with Lakatos] that whether ‘an experiment is crucial or not, 
whether a hypothesis is highly probable . . . whether a problem-shift is pro- 
gressive or not is not dependent in the slightest on the scientist’s beliefs, per- 
sonality or authority’, but it is dependent on what, according to the scientist is 
the role of science and of the scientist; on what, according to him, are the basic 
concepts, theorizing in terms of which is legitimate scientific thought; on 
what he thinks is the accepted limit of speculation in order to get advance- 
ment.2 


This argument is, however, untenable. For if one fully agrees that such 
things as probability of a hypothesis or progressiveness of a research 
programme do not depend on the scientist’s beliefs, how can they, at the 
same time, depend upon ‘what, according to the scientist is the role of 
science . . . on what, according to him are the basic concepts . . .’, and 
thus be in fact dependent on the scientist’s beliefs? Some sort of justifi- 
cation for this is offered on the next page: 


These are rational considerations. These cognitive considerations I shall 
call the image of science: it is the sum total of thoughts on what science is and 
should be and it has a major rational influence on the scientific programme of 
individuals, schools, communities. If this be psychology, it is cognitive psycho- 
logy not motivational.3 


But surely, the fact that these considerations may legitimately be viewed 
as rational does not make them any more relevant to the issue. If we agree 
that a theory is either true or false irrespective of anybody’s beliefs, it 
seems strange to divide these beliefs into two groups, claiming that those 
for which one has rational reasons are relevant while those that are moti- 
vated otherwise are irrelevant to the truth or falsity of a theory. The 
same applies to the criteria of progress defined in objectivist terms. Whether 
a theory T1 has a greater logical content than a theory T2 is essentially 
independent of any beliefs and opinions held by scientists. The claim that 
it is indeed independent of beliefs called motivational yet dependent on 
other beliefs labelled as rational hardly makes sense. 

This confusion springs out of the conflation of two different aspects of 
the problem ‘how does the knowledge grow?’. It arises from a failure to 
separate the normative-epistemological aspect from the empirico- 
historical aspect of the question. Elkana refers to attempts to separate the 


1 Lakatos [1971], p. 196. 2 Elkana [1974], p. 245. 3 Ibid., p. 246. 
normative from the descriptive side as ‘old hat’, yet it seems to be this 
very issue which is the source of the logical difficulties of his position. 
The normative problem (to which Lakatos proposed a solution) can be 
restated as follows: What constitutes growth as opposed to mere change? 
Can we formulate general criteria for what is a progressive problem-shift? 
The empirical problem of growth can be reformulated as follows: How 
did scientific ideas develop? What is their origin and what are the factors 
that influenced their development? When we deal with the first type of 
problem, e.g., when we want to explain why Copernican theory constituted 
progress over the Ptolemaic system, we do not have to investigate what 
influenced the scientists in their choice of problems or in the process of 
formulation of their theories. Also the reasons why the Copernican theory 
was finally preferred are not directly relevant to this issue. The question 
refers to a historical situation, yet it is a normative and not a historical 
problem. Elkana is primarily interested in ‘what causes change in the 
content of knowledge, and what is that part which grows, i.e., serves as a 
nucleus of accumulation and continuity’, that is, in the second type of 
problem. One of his main theses (supported by interesting case studies) 
is that the development of knowledge is best understood as a result of 
mutual influences between theories about nature and (mainly normative) 
theories about science (which he terms ‘the image of science’). Once we 
distinguish between the normative problem: why is one theory better 
than another?, and the historical problem: why was one theory accepted 
rather than another?, it becomes apparent that different types of answer 
have to be offered for each of them. The answer to the normative question 
is to be sought within the realm of logic and methodology since we are 
concerned here with products of scientists’ creativity and deliberations 
and not with the genesis of their thoughts. When dealing with the second 
type of problem it is clear that external aspects (concerning influences) 
are just as relevant as internal aspects (concerning problem-situations). 
What influenced scientists in their deliberations is an empirical problem 
and one cannot make strict a priori distinctions as to what type of factors 
are relevant qua influences. 

It would seem that the major part of the controversy between Lakatos 
and Elkana can be resolved simply by pointing out that each is dealing 
with a different problem; that since Lakatos is concerned with a normative 
aspect and Elkana with a historical one, their views do not really clash. 
In this way we would, however, miss the most interesting aspect of their 
differences. For there remains a genuine disagreement as to what is and 
what is not relevant for understanding science and its history. Prima facie, 
one would have to side with Elkana, for if we want to explain how ideas 
developed we surely cannot ignore the social context in which this develop- 
ment took place. Although this does not constitute an argument that the 
internal-external distinction is untenable, one could wonder what is the 
point of drawing it. However, when we ask the question: What is the 
relevance of history for science-teaching (which Elkana claims to be his 
prime interest), the situation looks different. Elkana subscribes to Lord 
Bolingbroke’s metaphor: ‘History is philosophy teaching by examples’. 
Leaving the ‘inductivist’ connotations aside, the question is: what is the 
lesson that history is supposed to teach us? It seems that besides the some- 
what obvious conclusion that the development of ideas is also influenced 
by socio-psychological and other external factors, the area that may 
increase our understanding of science and its development is precisely its 
internal history. 

Elkana does not explain what we are supposed to learn from external 
history, i.e., what the lesson is that can be drawn from the fact that ideas 
are influenced, co-determined or even determined by socio-psychological 
and other circumstances. The reason for this omission is not hard to find, 
for there seems to be no such lesson. Let us take an extreme example. 
Suppose that the water in Koenigsberg had such a chemical composition 
that in combination with the food that Kant regularly ate it affected the 
functioning of his brain. Let us further assume that unless this process 
took place, Kant wouldn’t have produced his Critiques. Now, it would be 
true, in a sense, that if we want to understand why Kant has written what 
he wrote we would have to take the peculiarities of the water into con- 
sideration. Pointing to this fact could actually constitute some sort of 
explanation. However, in another (and much more interesting) sense we 
can understand why Kant draws conclusions about synthetic a priori 
judgments even without considering this causal process. It may be 
objected that my extreme example concerns the interaction between physical 
and theoretical structures, while Dr Elkana stresses the relevance of the 
interaction between the theoretical and the socio-psychological structures. 
However, the argument remains just the same whatever causal factors we 
deal with. What do we learn about science from the explanation that 
certain problems were chosen because they were considered socially more 
significant? What do we learn about science from the fact that the 
choice of a research programme has been decisively influenced by the 
prevailing image of science? Even though it may be true that without 
these influences the science of today would be quite different, our under- 
standing of scientific and methodological problems would hardly increase 
with the discovery of these influences. Whether it was a belief in the signifi- 
cance of the problem or the legendary apple which allegedly fell on 
Newton’s head that gave him the incentive to work on his gravitation theory, 
the knowledge of these factors does not improve our understanding of 
this theory. Moreover, such considerations are completely irrelevant to 
the question: why is Newton’s theory better than that of Galileo?, a 
question from which, I think, we may learn something important about 
science and its progress. 

While rejecting the internal-external distinction, Elkana stresses the 
need for analysing those factors that constitute a major rational influence 
on scientists. He does not, however, tell us in what sense he uses the word 
‘rational’ here. My guess is that the rational influences which he recom- 
mends us to consider as relevant for understanding science and its growth 
are precisely those that can be internally reconstructed with reference to 
the underlying theoretical structures, with reference to logical relations 
between thought-contents and problem-situations. 

It appears that when in historical explanations we appeal to the beliefs 
of scientists and to the influences they were exposed to, our understanding 
of science (and even of the thinking of scientists themselves) increases 
primarily (if not exclusively) in those cases where the beliefs and influences 
point to the corresponding logical connections between ideas. By the same 
token, the understanding of the history of science (unless taken as memor- 
ising a chronology of events), has to be connected with the understanding 
of interrelations between theoretical structures. History of ideas which is 
not based on some internal reconstruction could only produce a collection 
of causally related (or altogether unrelated) curiosities. 

One of the standard objections to the internalist approach to science is 
that since this approach deals with the products of scientific activity rather 
than with the activity itself, it can only present a static picture of science. 
Since science is permanently in flux, one cannot capture its characteristic 
features in terms of petrified theoretical structures. But, for all its suggest- 
iveness, it remains unclear what is the logic behind such arguments. There 
is no reason why explanations in terms of interaction between theoretical 
entities could not be used for explanation of dynamic development. The 
demand for special (sometimes called ‘dialectical’) explanations of dynamic 
processes by suitable concepts seems to suggest that a theory of change 
should be itself constantly changing or that a theory of electro-magnetism 
should itself have electro-magnetic properties. 


So far I have argued in favour of rational reconstruction, trying to defend 
the ‘internal-external’ dichotomy against Elkana’s criticism. I shall now 
try to show that Lakatos uses this dichotomy in two different ways, which 
creates difficulties for his methodology. I shall try to show that the 
meaning-shift and the resulting problems are related to his attempt to 
construct a general method for appraisal of methodologies with the help 
of the history of science. I shall critically examine Lakatos’s theory of 
‘historical test’, which sees the history of science as the final arbiter in 
methodological disputes. At the same time I will express some reservations 
concerning Lakatos’s formulation of the criterion of progress. 

The distinction between internal and external history (where internal 
history is understood as an objective, ‘third world’ analogue of the thought- 
processes of scientists) is very important for the whole of Lakatos’s 
philosophy of science. It is mainly by emphasis on internal, rationally 
reconstructed history that Lakatos differentiated his philosophical attitude 
from Kuhn’s: 


... the—rationally reconstructed—growth of science takes place essentially 
in the world of ideas, in Plato’s and Popper’s third world, in the world of 
articulated knowledge which is independent of knowing subjects. Popper’s 
[and Lakatos’] research programme aims at description of the objective scientific 
growth, Kuhn’s research programme seems to aim at a description of change in 
scientific mind . . .1 

... My concept of a ‘research programme’ may be construed as an objective 
‘third world’ reconstruction of Kuhn’s socio-psychological concept of paradigm.2 

One of the main features of Lakatos’s philosophy of science is that 
methodology should study the products of the cognitive activity of scientists, 
rather than the activity itself (which is the subject matter of psychology). 
Lakatos draws his distinction between the internal and the external 
history to uphold this ‘objectivist’ approach, to keep the ‘logic of dis- 
covery’ separated from the ‘psychology of research’.3 Turning to history, 
it is clear that logical relations between ideas are not perfectly mirrored in 
the consciousness of scientists. Lakatos suggests that we ‘indicate dis- 
crepancies between history and its rational reconstruction [by relating] 
the internal history in the text, and indicate in the footnotes how actual 
history “misbehaved” in the light of its rational reconstruction’.4 Following 
this conception we can obtain two histories that may coincide or diverge. 
There is no conflict between them, they can be regarded as parallel to 
one another. Even when internal history significantly differs from external, 
they can be both true: two theories may be objectively incompatible and 
yet be both accepted by the same scientific community. Under this inter- 
pretation (where internal history describes the logical relations between 


1 Lakatos [1970], p. 180. 2 Ibid., p. 179. 

3 See Lakatos’s contribution to the discussion of Agassi [1974] and Mendelsohn [1974], 
pp. 427-31, or Lakatos’s reply to Kuhn in Buck and Cohen (eds.) [1971], pp. 174-5. 

4 Lakatos [1971], p. 217. 
ideas while the external history refers to psychological factors) Lakatos’s 
preference for those historical accounts that explain most of the actual 
history ‘internally’ can be well understood. However, in the context of 
the historical test of methodologies the ‘internal-external’ distinction 
undergoes a substantial change in meaning. It ceases to function as a 
distinction between the third world theoretical structures and the second 
world thought-processes; for as such internal history should be the same 
irrespectively of the methodology which guided the rational reconstruction. 
Lakatos does not only want to evaluate particular historical reconstructions 
with respect to the proportion of the internal over the external explana- 
tions, he wants to show that the extent to which events can be explained 
‘internally’ depends on a given methodology. Methodologies are to be 
appraised according to how much of the actual history they can explain 
internally. 

External explanations are thus considered as a sign of a poor methodo- 
logy. It seems to me that Lakatos goes even further. Although he himself 
states that ‘radical internalism is utopian’, he writes in a disapproving 
manner of methodologies which lead to internal histories that are 
compatible with different external explanations. In this vein he writes 
about ‘Inductivism’; the rational reconstruction to which it allegedly 
leads ‘is compatible with many different supplementary empirical or 
external theories of problem choice’. Lakatos creates the impression that 
it is a vice that a rational reconstruction should be equally ‘compatible 
with the vulgar-Marxist view that problem choice is determined by social 
needs’ and with another ‘external theory that the choice of problems is 
primarily determined by inborn, or arbitrarily chosen (or traditional) 
theoretical (or ‘metaphysical’) frameworks’.1 Similar remarks of dis- 
approval are made against ‘conventionalism’2 and against Popper’s 
‘falsificationism’.3 However, besides the difficulty of reconciling this 
application of the internal-external dichotomy with its original meaning, 
it is also hard to see what is the force of Lakatos’s criticism. Why should 
compatibility with different empirical theories of problem choice be 
viewed as a sign of a bad methodology? The choice of problems within 
the research programme that was carried out during World War II 
at Los Alamos was surely very much determined by extra-scientific 
‘needs’. Any reconstruction which is incompatible with the actual empirical 
explanation of the problem choice should, I think, be condemned as an 
example of bad apriorism, rather than praised for its explanatory power. 
Incompatibility of internal reconstruction with various types of external 


1 Ibid., p. 199. 2 Ibid., p. 202. 
3 Ibid., pp. 204-5. 
(empirical) explanations should be regarded as a vice rather than as a 
virtue for the simple reason that one of the possible external accounts 
is the true one. 

One can sympathise with Lakatos’s contention that historians should 
try to explain as much of the history as possible, internally—that is with 
reference to the content of ideas and to their logical inter-relations. 
However, if internal history is to be at all associated with theoretical 
structures and contrasted with external empirical history, then the ratio 
of the internal to external explanations cannot be used as a test of rival 
methodologies. For internal history, so conceived, should always be com- 
patible with any number of external empirical explanations. The point is, 
which internal reconstruction is the most useful for the understanding of 
science and which external reconstruction is the closest to what has 
actually happened? 

Let me now turn to the proposed historical test of methodologies. 
Lakatos’s central idea is very attractive. It suggests that one does not 
have to use laborious and often indecisive logical and epistemological 
arguments to disqualify rival methodologies. Lakatos proposes that it 
would suffice to check the performance of a given methodology when its 
implied theory of rationality is turned into a hard core of a historio- 
graphical research programme. He maintains that ‘methodologies may be 
criticised without any direct reference to any epistemological (or even 
logical) theory, and without using directly any logico-epistemological 
criticism’ and he presents a ‘general theory of how to compare rival 
logics of discovery in which . . . history may be seen as a “test” of its 
rational reconstruction’. In order to show the applicability of this his- 
torical test to different methodologies, Lakatos first argues that each 
methodology implies normative guidance for historiography, for rational 
reconstruction. ‘The basic idea of this criticism is that all methodologies 
function as historiographical (or meta-historical) theories (or research pro- 
grammes) and can be criticized by criticizing the rational reconstruction to 
which they lead.’ The content of the rational reconstruction (which now 
differs from methodology to methodology) is to be contrasted with the 
actual history. But what is this actual history? Lakatos not only assures us 
that ‘history without some theoretical “bias” is impossible’.2 He makes 
a much stronger claim when he identifies this bias with the normative 
import of methodologies, that is, with that which is supposed to be tested. 
On his account ‘historiographical “factual” propositions are also theory- 
laden: the theories involved are methodological theories’.3 But if this be 


1 Ibid., p. 220. 2 Ibid., p. 215, n. 60. 3 Ibid. 
the case, what could we learn from such a historical test? How could a 
history serve as a test of the methodology on the basis of which it has been 
reconstructed? A historical test so conceived has a very strong circular 
element. For it is obvious that history interpreted according to specific 
methodological criteria would always tend to vindicate that rational re- 
construction which is explicitly based on this methodology rather than 
any rival one. Lakatos reminds us that ‘internal history is not just a 
selection of methodologically interpreted facts: it may be, on occasions, 
their radically improved version’.1 This provision would admittedly enable 
us to contrast the ‘pre-reconstructed’ history with its improved rational 
reconstruction, but it does not really solve the problem of circularity. We 
would still compare two reconstructions pregnant with the same methodo- 
logical bias.2 

The underlying idea of Lakatos’s method for evaluating rival methodo- 
logies is, prima facie, plausible and intuitively clear: each methodology 
implies a rationality theory which provides the rules for acceptance and 
rejection of theories. These serve as normative guidance for the rational 
reconstruction. Now, if the normative standards implied by the methodo- 
logy do not acknowledge the practice of scientists as ‘rational’ the methodo- 
logy is no good. After all, science is the paradigm of rationality. In other 
words, the method of historical test requires that a methodology should 
be vindicated (as much as possible) by scientific practice. 

Undoubtedly, a lot can be said in favour of this requirement. However, 
the connection between the facts of history and the merits of a methodo- 
logy remains essentially unclear; and the impact of historical criticism on 
methodological rules remains unclear too.3 If one can show that the 
Copernican theory was not in fact preferred to the Ptolemaic system for 
reasons of its alleged greater simplicity, then this is a blow for the his- 
torians that make a contrary claim. However, what bearing does this fact 
itself have on a methodology which implies a preference for simpler 
systems? Lakatos brings examples of alleged ‘crucial’ experiments and 
shows that, as a matter of fact, research programmes were never aban- 
doned as a result of any single negative test. But (assuming that Lakatos 
is historically correct) should we therefore conclude that there is no point 
in trying to set up crucial experiments, or that the methodology which 


1 Ibid., p. 216. 

2 In his [1976] John Worrall makes some important clarifications of the idea of ‘historical 
test’ showing that it need not be circular. 

3 John Worrall further elaborates Lakatos’s method by supplementing it with an additional 
principle which helps to bridge the logical gap between the descriptive nature of historical 
evidence and the normative standards of a methodology. See his [1976], especially 
section 5. 


Comments on Elkana and Lakatos 339 


recommends them is therefore inferior to the one that does not? Although 
science is surely the best example of rational activity, it still is not clear 
how we could infer from the facts of history of science to the merits of a 
particular methodology. The problem is how to select the historical 
evidence which is to be compared with our methodology. Obviously we 
cannot demand that all judgments of scientists should agree with our 
methodological principles. On the other hand we also cannot just select 
only those judgments and decisions that are seen as correct in the light of 
our methodological rules, for then the test would clearly be tautological. 
Another possibility is to find the vindication of our methodology in its 
agreement either with the ‘basic judgments’ of most scientists, or with 
those of the best scientists. The difficulty of the first approach lies in its 
unqualified quantitative character. There is no reason to suppose that the 
opinion of the majority is better than the contrary view of the outstanding 
few. Indeed, it would be a strange victory if our methodology happened 
to be vindicated by the majority of the casual scientific practice while 
that of the scientific élite would fall outside its scope. Lakatos opts for 
the second approach. Methodology is contrasted with the ‘best science’, 
with the basic judgments of the best scientists or with those decisions 
that clearly led to progress. This approach does not need to rely on the 
legislation of the methodology under test as to what are the ‘best gambits’ 
in science. One can appeal to the consensus of the scientific community. 
Lakatos points out that there is such a consensus.1 He concludes that ‘a 
general definition of science thus must reconstruct the allegedly best 
gambits as “scientific”: if it fails to do so, it has to be rejected’.2 Lakatos’s 
chief concern is that a methodology should not be at odds with progress. 
A good methodology must not be in conflict with those decisions of 
scientists which led to an undisputed progressive problem shift. 

This is, of course, quite a reasonable requirement. However, Lakatos’s 
method of appraisal of rival methodologies with regard to how many steps 
leading to progress can be explained in accordance with their normative rules 
cannot be taken at its face value. It rests on the assumption that correct 
methodological decisions lead to progress. The assumption is, of course, 


1 Lakatos [1971], p. 222: ‘While there has been little agreement concerning a universal 
criterion of the scientific character of theories, there has been considerable agreement 
over the last two centuries concerning single achievements. While there has been no 
general agreement concerning a theory of scientific rationality there has been consider- 
able agreement concerning whether a particular step in the game was scientific or crankish, 
or whether a particular gambit was played correctly or not.’ 

2 Ibid., p. 222; Lakatos later modifies this requirement so that methodologies conflicting 
with the ‘best gambits’ are not rejected outright. Agreement or disagreement with the 
‘basic value judgement’ of scientists nevertheless remains as the basis for appraisal of 
rival methodologies. 
trivially true when pronounced with hindsight. However, unless understood 
as analytic, it can hardly be accepted without reservations. In order to 
accept it, we would have to assume that the path of progress is unique— 
i.e., that progress could have been achieved in no other way but the way 
it has been actually achieved. We have no good reasons to suppose that 
this assumption is either true or useful. Moreover, if we want to extract 
any normative support for our methodology from those steps in science 
which led to progress, we would further have to assume (with Leibniz) 
that the actual progress of science was the best of all possible progresses. 

Let us look at some examples that are supposed to vindicate Lakatos’s 
philosophy of science vis-à-vis, say, that of Karl Popper. Lakatos’s methodo- 
logy gains vindication from every instance of a recorded counter-example 
(or anomaly) to the progressing research programme which was ignored 
by the protagonists of that research programme. At the same time, such 
instances are to detract from the credibility of Popper’s methodology 
which implies that attempts to solve these problems constitute a major 
factor in the growth of scientific knowledge. However, in order to accept 
this verdict it would first have to be shown that in all (or most) such 
instances attempts to solve the problems resulting from the counter- 
examples (or anomalies) would lead to stagnation. Only then could the 
fact that progress was achieved despite known counter-examples have a 
direct bearing on the respective merits of the two methodologies. The 
same applies to inconsistencies. Lakatos makes a point that a research 
programme can progress even on inconsistent foundations. (He mentions 
the early development of the infinitesimal calculus and naive set theory as 
examples.) But, in order to derive from these facts support for a methodo- 
logy which allows inconsistencies or recommends one to ignore them as 
long as the research programme is progressing, one would have to show 
that in most cases attempts to rectify inconsistencies would have 
produced (in the long run) results inferior to those that have actually been 
achieved. 

These considerations bring me to another problem of Lakatos’s his- 
torical test. His method gives high marks to those methodologies that can 
explain most of the history of scientific progress within ‘the legislative 
domain of its normative rules’. But we learn from the case studies of 
Lakatos, Elkana, Feyerabend and others that progress in science has been 
achieved in a variety of different and often conflicting ways. We learn 
that a research programme can advance if anomalies or inconsistencies 
are ignored, but it can also advance if these are solved or acted upon. The 
choice of a simpler theory may lead to an advancement but so can a decision 
in favour of a more complicated system. Given the multitude of contrary 
evaluations and principles that all sometimes led to progress, a philosopher 
of science is confronted with the following dilemma: if he wants to pass 
Lakatos’s historical test with distinction he either has to adjust the 
history to fit his methodological requirements, or to relax these require- 
ments so that they would clash with nothing, in which case they would 
lose their normative import. With the growing interest in the history of 
science the first approach has largely been discredited. Thus the remaining 
option is to adjust the methodological principles to suit the history. In 
this way the methodology (originally conceived as normative) becomes 
exclusively descriptive; its normative content evaporates. As a result we 
end up with a set of rules that are easily vindicated by almost any example 
from the history of science; however, it becomes increasingly unclear 
what these rules prescribe or forbid. 

It seems that Lakatos’s philosophy of science comes very close to this 
‘normative’ all-inclusiveness. After presenting an elaborate explication of 
progress in terms of progressive and degenerative problem shifts, his 
methodology tells us that it is rational to abandon a degenerating research 
programme in favour of a more fortunate competitor. But Lakatos’s 
methodology also tells us that it is equally rational to stick to a degenerating 
research programme since it can always stage a comeback. It is clearly not 
difficult to find historical examples for such a theory of rationality. It is 
more difficult to find in Lakatos’s methodology those ‘normative rules . . .
[for] (scientific) acceptance and rejection of theories or research pro-
grammes . . . whose violation is intolerable’.1 For how does one violate a
rule that sanctions two contrary methodological decisions based on the same 
assessment of a situation? 

Lakatos tries to neutralise the objections to the evaporation of the 
normative content of his methodology by claiming that his philosophy of 
science only provides a theory of appraisal without hinting at heuristic 
advice. However, beside the difficulty of reconciling this contention with 
an extensive part of Lakatos’s writings, it is not clear what is the point of 
‘appraisal’ if it does not have any heuristic impact.2


Finally, let me add some critical remarks concerning Lakatos’s criterion 
of progress, which plays a crucial role in his methodology of research 
programmes. Lakatos distinguishes between theoretical and empirical 


1 Lakatos characterises different methodologies (including his own) ‘by rules governing 
the (scientific) acceptance and rejection of theories or research programmes. These rules 
have a double function. First they function as a code of scientific honesty whose violation 
is intolerable; secondly, as hard cores of (normative) historiographical research
programmes’. Ibid., p. 197.

2 For a discussion of this problem see Feyerabend [1970], Quinn [1972], Smart [1972]
and Worrall [1976].
growth of a research programme. Theoretical growth consists in the
increase of predictive power of a research programme while empirical
growth consists in the increase of corroborations of the predictions. His
definition runs as follows:

A research programme is said to be progressing as long as its theoretical growth
anticipates its empirical growth, that is as long as it keeps predicting novel facts
with some success (“progressive problemshift”); it is stagnating if its theoretical
growth lags behind its empirical growth, that is as long as it gives only post hoc
explanations either of chance discoveries or of facts anticipated by, and dis-
covered in, a rival programme (“degenerating problemshift”).1

This definition becomes problematic when it is taken together with 
Lakatos’s tolerant attitude towards inconsistencies. As I have already 
mentioned, Lakatos allows that research programmes may progress even 
on inconsistent foundations. But since it is a fact of elementary logic that 
any statement whatsoever may be validly derived from an inconsistent 
system, any discovery could be validly predicted (or retrodicted). This 
means that the theoretical content would always exceed the empirical one. 
Thus any inconsistent research programme would have to be considered 
as ‘progressive’. Research programmes with inconsistent foundations are 
thus not only tolerated, but if progress is a desirable thing, they should 
actually be preferred on Lakatos’s definition of progress. In this way his 
methodology would place a high bonus on inconsistency. 
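
The 'fact of elementary logic' appealed to in this argument is the principle usually called ex falso quodlibet. A minimal formal sketch in Lean 4 follows; the theorem name `explosion` and the variable names are illustrative labels of mine and are not drawn from Lakatos.

```lean
-- From a statement P together with its negation, an arbitrary statement Q follows.
theorem explosion (P Q : Prop) (hP : P) (hnP : ¬P) : Q :=
  absurd hP hnP
```

Since Q is arbitrary, the same derivation yields any 'prediction' whatsoever, which is all that the argument above requires.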

It may be objected that this difficulty is a purely formal one. Scientists 
would hardly rejoice in the discovery of inconsistency in their research 
programme and begin to derive novel predictions or the respectable 
theories of their rivals. The problem would not have to be taken seriously 
if it were not for Lakatos’s insistence that his ‘concept of a research 
programme may be constructed as an objective, “third world” reconstruc- 
tion of Kuhn’s socio-psychological concept of paradigm’, that his 
‘rationally reconstructed growth of science takes place essentially in the 
world of ideas, in Plato’s and Popper's “third world”, in the world of 
articulated knowledge which is independent of knowing subjects’.2

Even if we leave inconsistencies aside and allow for a more pragmatic 
interpretation of Lakatos’s research programmes, his definition of progress 
could still lead to strongly counter-intuitive results. I shall illustrate my 
point by the following example. Let us consider two rival research pro-
grammes (SRP₁ and SRP₂). Assume that for some external reasons the
protagonists of SRP₁ do not come up with any novel predictions.3 On
the other hand, the scientists working within the framework of SRP₂ are
continuously coming up with novel interesting predictions. Most of the
test results of these predictions, however, turn out to be in conflict with the
theories and assumptions constituting the research programme SRP₂.
Let us further suppose that it so happens that the test outcomes are not
only consistent with the rival SRP₁, but that they are actually derivable
from the theories and assumptions of this research programme SRP₁
(without ad hoc adjustments). In such a situation, all predicting is done
within SRP₂, and since we have allowed for the corroboration of a
few of the predictions, SRP₂ should be termed ‘progressive’ (with all the
normative connotations that the term may carry). Yet, it is surely strange
to prefer SRP₂ as ‘progressive’ since, intuitively speaking, the test results
indicate that there is something essentially wrong with the research
programme SRP₂ and something basically correct in the research pro-
gramme SRP₁; or, to use Popper’s terms: the truth-content of SRP₂ is
growing smaller and its falsity-content increases relatively to SRP₁. I
think that for a knowledgeable historian of science it would not be difficult
to find real examples that have the basic structure of this argument.

1 Lakatos [1971], p. 207.
2 Lakatos [1970], pp. 179-80.
3 E.g., the proponents of a research programme may be liquidated (the case of Mendelian
genetics in the U.S.S.R.), research funds may dry up, scientists may stop theorising and
devote all their energies to commercial or industrial application of the previous results, etc.

One may object that the issue of which programme does the predicting
is a third-world matter, i.e., a question of logical relations and not of
which group of scientists happen to make some statement first. This was 
also Lakatos’s attitude. However, his definition of progress (as quoted 
above) heavily rests on the temporal factor. This goes together with 
Lakatos’s view of priority disputes in science, which he considers as part 
of internal history. But since the third-world is supposed to be time-less, 
the question arises whether his definition (as it stands) can be considered 
as being purely concerned with third-world matters. 

This, I believe, leaves Lakatos’s definition open to two interpretations,
both of which are problematic. If we take a strictly third-world interpreta- 
tion, then it runs into the problem with inconsistencies. If we interpret it 
pragmatically, then the definition would make appraisal dependent on 
accidental external matters which Lakatos himself would consider ir- 
relevant. 

Jerusalem University 
Discussion


INCOMMENSURABILITY AND THE RATIONALITY OF THE
DEVELOPMENT OF SCIENCE 


The Thesis of Incommensurability (henceforth referred to as TI) says that two 
theories separated by a revolution may be altogether incommensurable. Accept-
ance of TI seems to undermine the possibility of representing revolutionary
advances in science in a rational way. According to TI, such “advances” proceed 
in a disconnected way, so that it is not possible to find logical connections 
between the old theory and the new theory that replaced it. 

The aim of this note is to present two criticisms of TI. My first criticism 
will be that if TI were true it would apply not only to revolutionary periods but
to all phases of theoretical change in science. Most defenders of TI would allow 
that, during periods of non-revolutionary or “normal” science, an improved 
theory T’ is comparable with the earlier theory T from which it was developed;
so that rational appraisal is possible in these cases. I agree. But I claim that T’
and T will be “incommensurable” in these cases no less than in cases of revolu- 
tionary change: thus “incommensurability” either excludes rational comparison 
and appraisal in non-revolutionary cases, or allows rational comparison and 
appraisal in revolutionary cases. 

This criticism is based on a historical claim: both in revolutionary and non- 
revolutionary cases, at least in physics and chemistry, a typical pattern occurs 
again and again, namely, the theory T’ that is subsequently accepted is in the
relation of correspondence with its predecessor T. 

My second criticism also arises from this historical fact. It is this. The rela- 
tion of correspondence does indeed involve meaning variance without involving 
incommensurability, at least in the sense of TI: for T and T’ will be rationally 
comparable if they are in the relation of correspondence. Allowing that TI is 
correct concerning meaning variance, I will propose a “desemanticised” explica- 
tion of the relation of correspondence. This will enable us to say that the theories 
T and T’, considered as formal sentences, are in a relation of correspondence
even though they become “incommensurable” when semantical interpretations 
are put upon their extra-logical vocabularies (one in the language of T and the 
other in the language of T’).


(i) The major premise of TI. 

There are some important differences between the viewpoints of the two main 
proponents of incommensurability: Thomas Kuhn and Paul Feyerabend.2 But
I am going to assume that they both accept that two successive theories (or 
paradigms) separated by a revolution are incommensurable in a sense that 
implies that they are not rationally comparable: the two theories are like two 


1 It has frequently been pointed out that two incommensurable theories need not be
incomparable, e.g. by Giedymin [1975].
2 Compare Feyerabend [1970] and Kuhn [1970a] and [1970b].
different languages which organise the world in essentially different ways.
Moreover, no neutral language is available, or even possible, into which both
theories could be translated. Thus one may say that the adherents of incom-
mensurable theories live in different worlds: although many of the signs in the
two theories (e.g., “mass”, “force”, “compound” etc.) may be identical, their
meanings are essentially different. 


(ii) First criticism of TI: incommensurable theories in non-revolutionary periods.

I agree that even when the majority of signs in two successive theories are 
identical, their meaning may be significantly different. If such meaning variance 
implies incommensurability then the theories are incommensurable. I will now 
argue that “incommensurable” theories occur not only in revolutionary but 
equally in non-revolutionary periods. The language of science is constantly 
undergoing changes, as is everyday language. “Incommensurable” theories
appear throughout the history of science. There is no non-arbitrary way to 
distinguish the revolutionary and non-revolutionary periods of scientific develop- 
ment. They can only be separated in a more or less artificial way. 

I will illustrate this claim with some historical examples: 


The law of the conservation of energy essentially says: ‘In an isolated system 
the total amount of all kinds of energy remains constant.’ This law underwent 
some changes as a result of later developments in thermodynamics, which 
brought about important changes in the concept of energy. 


At the beginning of the nineteenth century the formulation of the law could 
be represented thus: 


L: in an isolated system E_k + E_p = constant


where E_k denotes kinetic energy and is a function of v (velocity) and E_p denotes
potential energy, and is a function of h (height).

After the discovery of the laws of thermodynamics, L was enlarged to include
thermal energy. The new formulation of the law, L’, was then:


L’: in an isolated system E_k + E_p + E_t = constant


where E_k and E_p are as before, and E_t denotes thermal energy, and is a function
of t (temperature).

The shift from L to L’ may thus be construed as involving a change in the
meaning of “energy”, both in its intension and its extension. The intension of 
“energy” in L’, compared with its intension in L, has a new connotation: thermal 
energy. The extension of the term changes as well, on this construal. What are 
counter-instances to the law of the conservation of energy as formulated in L 
may be instances of the law as formulated in L’ (namely, cases where a change in 
kinetic plus potential energy is compensated by a change in entropy). 
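
The point that a counter-instance to L may be an instance of L’ can be made concrete with a small numerical sketch. The scenario (a block sliding on a level surface against friction), the function names and the numerical values are my own illustrative assumptions; the sketch also assumes that the work done against friction reappears entirely as thermal energy.

```python
G = 9.81  # gravitational acceleration in m/s^2 (assumed value)


def energies(mass, v0, mu, distance):
    """Return (kinetic, potential, thermal) energy in joules for a block that
    started at speed v0 and has slid `distance` metres on a level surface
    with friction coefficient mu. Hypothetical illustration only."""
    thermal = mu * mass * G * distance        # work done against friction
    kinetic = 0.5 * mass * v0 ** 2 - thermal  # kinetic energy still left
    if kinetic < 0:
        raise ValueError("the block would have stopped before this distance")
    potential = 0.0                           # level surface: height is constant
    return kinetic, potential, thermal


def satisfies_L(states, tol=1e-9):
    """L: kinetic + potential energy stays constant across the states."""
    totals = [k + p for k, p, _ in states]
    return max(totals) - min(totals) <= tol


def satisfies_L_prime(states, tol=1e-9):
    """L': kinetic + potential + thermal energy stays constant."""
    totals = [k + p + t for k, p, t in states]
    return max(totals) - min(totals) <= tol


# Three successive states of the same isolated system.
states = [energies(2.0, 3.0, 0.2, d) for d in (0.0, 0.5, 1.0)]
print("instance of L: ", satisfies_L(states))
print("instance of L':", satisfies_L_prime(states))
```

On these assumptions the same sequence of states violates L and conforms to L’, which is the sense in which the extension of “energy” has changed.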

This kind of enlarging of a law is usually regarded as a “normal”, non- 
revolutionary change. Nevertheless it involves concept-stretching, and hence 
meaning variance: in this sense L and L’ are “incommensurable”. 

One reason why it is difficult to demarcate between non-revolutionary and 
revolutionary science is that new ideas in science often have their greatest impact 
only after a long period of time, producing changes later that are independent of,
and often contrary to, the intention of their creators.

Let us consider the following examples: The idea of central forces was one of
the most important within Newton’s research programme. According to it, all
active forces are central. In the beginning of the nineteenth century Oersted
conducted a famous experiment: he placed a magnetic needle in the centre of a
circular conductor and the needle declined in the perpendicular plane when
current was passed through the conductor. The scientific community was
shocked. Not only had a close connection between electricity and magnetism
appeared, but the force acting on the magnet was non-central. The classical
research programme had received a severe blow. To rescue that programme
Maxwell introduced his electromagnetic field theory. His aim seemed to be
achieved. He introduced a theoretical basis for non-central forces, which
however confined them to the electro-magnetic domain and established central
forces in the mechanical domain. Then, contrary to his intention, Maxwell’s
theory gradually undermined the mechanistic world-view, especially its Lap-
lacean determinism. The predictability of future states of closed systems became
problematic: since a field is continuous, to predict a future state a Laplacean
Demon would require an infinity of initial conditions.

A second example is the statistical interpretation of thermodynamics. Ludwig
Boltzmann was a passionate adherent of classical physics. His confessed aim1
was to maintain and develop classical mechanics (CM), which served mankind
so wonderfully and so usefully for so long. His problem was this. Clausius’s
second law of thermodynamics was asymmetrical with respect to time, in startling
contrast with the reversibility of all other laws of classical physics. (One may
mention Clausius’s apocalyptic prognosis about the thermodynamical death of 
the Universe, or the neo-Thomist entropological proof of the existence of God.) 

To rescue the hard core of the mechanistic programme Boltzmann introduced 
a statistical approach which stripped the second law of thermodynamics of its 
irreversibility. On his interpretation this was achieved by means of reduction: 
the macrolevel parameters were reduced to microlevel parameters. He then 
introduced a statistical assumption in connection with the ergodic postulate. 
Thanks to these efforts the second law of thermodynamics, in its statistical 
interpretation, lost the feature of irreversibility. The shift from a state of lower 
entropy to a state of higher entropy, though highly probable is no longer certain, 
and a reverse shift is possible. Indeed, as the period increases indefinitely, the 
probability of a decrease of entropy approaches arbitrarily close to one. 

Thanks to Boltzmann’s brilliant idea the classical programme had been saved. 
But only for the moment. Contrary to Boltzmann’s intention, his statistical 
approach gradually undermined the Newtonian programme: the laws of physics 
gradually became more and more statistical. The old ideal of an exceptionless 
law was gone. 

These two examples show that even the most eminent scientists may not 
know whether their own ideas are revolutionary or not. This may be known only 
in retrospect. A development which seems to belong within “normal” science 
may turn out to have revolutionary implications. This reinforces my earlier 
claim that “incommensurability” is not limited to revolutionary periods. 


1 See his [1905]. 
(iii) The principle of correspondence in the history of science.

As I said, an important role in the history of science is played by some principle
of correspondence. This serves as a bridge between successive theories, both in
non-revolutionary and revolutionary advances. Indeed many eminent scientists
made it a main heuristic requirement of their research programmes that a new
theory be in some relation of correspondence with the theory it supersedes.
Many examples may be quoted, both of such a principle being exemplified in 
scientific developments and of its being prescribed by scientists. 

Examples of the former are the relation between van der Waals’s law and
Clapeyron’s law, and the relation between Maxwell-Boltzmann kinetic theory
and the laws of Gay-Lussac and Boyle-Mariotte.

An example of the latter is Albert Einstein, who incorporated a correspondence
principle in his research programme as an important requirement: that a new
relativistic theory (or law) should yield some classical theory (or law) as a limiting 
case. Einstein succeeded in satisfying this version of the principle, both in the 
case of his Special Relativity with respect to the law of inertia, Maxwell’s equa- 
tions, Newton’s second law and the laws of the conservation of energy and 
momentum; and again in the case of General Relativity, where Poisson’s equation 
turned out to be a limiting case of the law of gravitation.1 A version of the
correspondence principle was, of course, explicitly used by Niels Bohr.


(iv) Second criticism: a desemanticised relation of correspondence between
incommensurable theories. 


How might an intransigent defender of TI respond to the role which the 
principle of correspondence has undoubtedly played in the history of science? 

When two “incommensurable” theories T and T’ are said by working scientists
to be in the relation of correspondence, this implies that T and T’ stand in certain
logical relations to each other. Thus if TI is not to be regarded as historically
refuted by the recurrence of this pattern, the incommensurabilist must deny 
that this relation did really hold between, say, the special theory of relativity 
and classical mechanics. For instance the incommensurabilist may say that men 
who are brilliant scientists are often poor methodologists: they believe that the 
relation of correspondence holds between incommensurable theories only 
because of “false consciousness”. He will say that if T and T’ are incommensur-
able, then it is impossible that logical relations hold between them because all
the terms they have in common have essentially different meanings in T and
in T’. Hence T and T’ are not rationally comparable.

In opposition to this claim I now want first to explain that logical relations 
may hold between two theories despite their incommensurability; and secondly 
to show that these logical relations make possible a rational appraisal of the 
comparative merits of the two theories. In other words, I shall try to show that 
Einstein’s requirement that a new relativistic law should yield the corresponding 
classical one as a limiting case, was not due to “false consciousness”.

For this purpose I will introduce a short, semi-formal analysis of RC (as I 
will henceforth refer to this relation of correspondence). 

Let T be an axiomatisable theory expressed in the language L and T’ be an 


1 See for example: Zahar [1973]. 
axiomatisable theory expressed in the language L’ and M be a meta-language
for both L and L’. Assume that T determines the numerical function F(x_1, ..., x_m)
such that all other numerical functions determined by T are mathematically
derivable from F; and assume that T’ likewise determines a [illegible] numerical
function F’(x’_1, ..., x’_m, y_1, ..., y_n), where x_1, ..., x_m, x’_1, ..., x’_m and y_1, ..., y_n are
terms containing [illegible]. In the meta-language M these terms are treated as
uninterpreted.

We now lay down the following as a sufficient condition for the relation
RC(T’, T) to hold: it is possible in the meta-language M to effect a pairing of
the uninterpreted signs x_1, ..., x_m of T with the uninterpreted signs x’_1, ..., x’_m
of T’ such that: the limiting value of F’ equals F when each of the variables
y_1, ..., y_n in T’ goes to a certain value which may be either zero, or infinity, or
some constant. In symbols:

If there is a numerical function F(x_1, ..., x_m) in T and a numerical function
F’(x’_1, ..., x’_m, y_1, ..., y_n) in T’ such that for all y_i (1 ≤ i ≤ n), either


lim F’ = F   (as y_i → ∞)
or
lim F’ = F   (as y_i → 0)
or
lim F’ = F   (as y_i → const.)


then RC(T’, T).

We can now see that meaning variance need not render T and T’ rationally 
incomparable. We can avoid the difficulties created by meaning variance by 
interpreting the relation of correspondence in this “desemanticised” way; we 
compare the mathematical functions F’ and F independently of the semantics 
of T and of T’. Formally speaking, F’ and F are brought together by means of
an arbitrary pairing of signs. Of course, a working scientist might very well be
guided by his intuitive understanding of the mathematical formalisms of the
two theories in deciding which signs in T to pair with which signs in T’. The
point is, however, that the present method requires only that a pairing be effected
in a way which leads to the result that F’ → F as y_1, ..., y_n tend to certain values.

After T and T’, treated as uninterpreted formalisms, have been brought into
the relation of correspondence, they may both be interpreted according to the
semantics of T’.

I will now illustrate this method in connection with STR and CM. However, 
instead of attempting to show that STR as a whole is in the relation RC to CM
I will only show that one representative fragment of STR is in the relation RC
to a corresponding fragment of CM. I will denote the former fragment by T’
and the latter by T. 

Let l denote length, let l_0 denote rest length, v velocity, and c the velocity of
light. Then T’ says l(x) = F’(l_0(x), v(x)) = l_0 √(1 − (v/c)²). And T says l(x) =
F(l_0(x)) = l_0(x). We bring T’ and T into the relation RC in accordance with the
above method by representing l and l_0 in T and l, l_0 and v in T’ respectively by:
x_1, x_2, x’_1, x’_2 and y_1. In our meta-language M we now pair the uninterpreted
terms x_1 and x_2 with, respectively, the terms x’_1 and x’_2. Thus we now write T
as a numerical function F(x_1, x_2) and T’ as a numerical function F’(x’_1, x’_2, y_1).
The relation of correspondence holds if there is a constant k such that:

lim F’(x’_1, x’_2, y_1) = F(x_1, x_2)   (as y_1 → k)

and there is such a k, namely k = 0.

These two uninterpreted formulas can now both be interpreted according to
the semantics of T’. On that interpretation they say respectively: the length of a
body is its rest length only if the body is at rest; the length of a body is always
its rest length.
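
The limiting behaviour claimed for this fragment can be checked numerically. The following is only a sketch under stated assumptions: the function names `f_prime`, `f` and `gap` are hypothetical labels of mine, the rest length is fixed at one metre, and the limit is taken in the velocity term (the sign paired with y_1) as it tends to zero.

```python
import math

C = 299_792_458.0  # velocity of light in m/s


def f_prime(rest_length, v):
    """Fragment T' (STR): l = l_0 * sqrt(1 - (v/c)^2)."""
    return rest_length * math.sqrt(1.0 - (v / C) ** 2)


def f(rest_length):
    """Fragment T (CM): l = l_0."""
    return rest_length


def gap(rest_length, v):
    """Absolute difference between the two functions at velocity v."""
    return abs(f_prime(rest_length, v) - f(rest_length))


# Let the velocity tend to zero and watch F' approach F.
velocities = [0.9 * C, 0.5 * C, 0.1 * C, 0.001 * C, 0.0]
gaps = [gap(1.0, v) for v in velocities]
for v, g in zip(velocities, gaps):
    print(f"v/c = {v / C:.3f}   |F' - F| = {g:.3e}")
```

Nothing in the computation depends on what the signs are taken to mean; it compares two numerical functions only, which is the desemanticised sense in which the fragments correspond.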

The relation of correspondence between theories in the empirical sciences is 
asymmetric. Some of the formulas of T’ are sent into formulas of T, when certain
conditions are fulfilled, but not vice versa. 


The history of science supplies us with quite a few examples of sequences of
theories (or laws), each of which is to be found in RC with its predecessor. For
example STR is in RC with CM; CM is in RC with Galileo’s law of free fall.
Such sequences of theories are by no means accidental. On the contrary they 
are achieved consciously and intentionally by the working scientist with the 
help of such heuristic tools as the principle of correspondence. 

The possibility of stating a relation of correspondence between successive 
theories, even when they are incommensurable, shows that contrary to the 
adherents of TI, there exists a rationality and a kind of continuity in the develop-
ment of science, the most wonderful enterprise of mankind. It is however a
sophisticated kind of continuity far from the naive image provided by eighteenth-
and nineteenth-century scientists and philosophers, and also far from the dis-
continuous paradigm-shift picture of scientific development suggested by the
incommensurabilists. 

IRENA SZUMILEWICZ 
Polish Academy of Sciences 
Review Articles


CHANGING PATTERNS OF RECONSTRUCTION*


1 Purpose of Book.
2 Basic Concepts, Linguistic Version.
3 Basic Concepts, ‘Objective’ Formulation.
4 Kuhn Sneedified.
5 Advantages and Disadvantages of the Statement View.
6 Incommensurability and the Rationality of Theory Change.
7 Miscellaneous and Conclusion.


1 PURPOSE OF BOOK


The second part of volume ii of Professor Stegmueller’s comprehensive and 
detailed treatise on the philosophy of science1 contains a new approach to
reconstruction followed by a discussion of Kuhn, Lakatos, and other devi- 
ationists. The discussion favours Kuhn. Kuhn, says Stegmueller, has developed 
a ‘new conception for the philosophy of science’ (X). The conception is not 
perfect in all respects, but it has advantages not shared by its competitors. It 
looks irrational, or makes science appear irrational only if it is judged by a 
widely held view of theories, the so-called statement-view: theories are (re- 
constructed as) statements, or classes of statements, their empirical content 
consists of classes of observation statements, their application of classes of 
observation statements together with suitable derivations, and so on. Having held 
the statement view in the past Stegmueller was once highly critical of Kuhn.2 But,
so he says now with candour, ‘in the past two years there occurred a small revo- 
lution in my mind. At a time when, dealing with [Kuhn’s] work, I was deeply 
involved in a mental crisis, I suddenly, in the middle of the night, saw the light 
and changed my paradigm of theory’ (X).3 The new paradigm that forms the basis
of the book may be explained as follows.


* Review of W. Stegmüller [1973]: Probleme und Resultate der Wissenschaftstheorie und
Analytischen Philosophie. Band 2: Theorie und Erfahrung, Zweiter Halbband: Theorien-
strukturen und Theoriendynamik, Berlin: Springer.

1 Numbers in brackets refer to pages of this volume. Numbers in brackets preceded by a 
roman i refer to pages of the first part of vol. ii. There is now also an English translation 
by W. Wohlhueter, The Structure and Dynamics of Theories, Springer, New York, 1976,
but I have not seen it. 

A large part of Stegmueller’s book explains procedures and results found in Sneed 
[1971]. My review is therefore also in part a review of Sneed. 

The ‘Sneed-Kuhn Theory’ as it has been called has many followers in Germany and has 
an increasing number of followers in the U.S. It has been applied to a variety of subjects, 
including economics and literary theory. Cf. the thesis by Heide Gottner and Joachim 
Jacobs [1975]. 

2 Cf. his remarks in Stegmüller [1971], esp. p. 27 ff. Stegmüller now calls some of those
remarks ‘unjust’ (p. 5).

3 The formulation is intentionally ironical. Stegmueller continues: ‘Such a description
does of course not preclude my being also able to give reasons for the new view.’ 
2 BASIC CONCEPTS, LINGUISTIC VERSION


Theories are reconstructed as concepts, or properties, and philosophical ideas
are defined accordingly. Assume a theory has domains D_1, D_2, ..., D_k, non-
theoretical functions f_1, ..., f_n, theoretical functions F_{n+1}, ..., F_s, axioms a_1,
..., a_p and A_{p+1}, ..., A_q (the a specifying properties of the functions such as
domain, co-domain, differentiability, and so on, the A adding further informa-
tion and postulating relations between the functions); then we may define a
predicate T by stipulating


x is a T iff                                   (1)
x = <D, f, F>
and
a
and
A


where the lower indices have been omitted for simplicity.

Applying the predicate to a concrete situation c^i (denoted by a name or a
singular description) in which the f are replaced by concrete constants f^i (if
c^i is a particular situation containing Jupiter, f position, then the f^i are the
positions of the first, second, . . . j-th ingredient of the situation, including the
position of Jupiter) we obtain:

c^i is a T                                     (2)

which is said to describe the i-th application of T. A measurement of a concrete
function φ^i that appears in the i-th application of T is said to depend on T iff
(roughly) there exists an x ∈ D^i such that in every available exposition3 of T
the description of the measurement that leads to a value of φ^i(x) involves a c^j
such that c^j is a T. The ‘abstract’ function φ (position, if φ^i is the position of
Jupiter in the i-th application of the theory) is said to be T-theoretical iff the
measurements of the φ^i depend on T for every application of T, non-T-theo-
retical otherwise. Models of T (class MT) are entities that satisfy T, possible
models (class MZ) satisfy T’ where T’ is defined as T, but without the A, partial
possible models (MZ, T) satisfy T” where T” is defined as T, but without the A,
the F and those a that specify properties of the F alone. Thus, given T as defined
in (1), <D^i, f^i, F^i> is a (possible) model of T corresponding to the i-th application
of T and <D^i, f^i> the corresponding partial possible model. Partial possible
models of a theory may be regarded as the facts of this theory, or as T-facts.4


1 The A usually coincide with the axioms of the statement view. All the definitions in this 
review are (sometimes vastly) simplified versions of definitions that occur in the book. 

2 Stating an application in this sense always involves the entire theory, not just some of its
consequences.

3 Expositions explain the methods used for obtaining values of functions, use of physical
laws included (p. 50). They change with the invention of new measuring instruments
and the discovery of new laws and the boundary line between terms which are T-
theoretical and non-T-theoretical changes accordingly.

4 For a single domain with a finite number of individuals and a single non-theoretical
function a partial possible model consists of the individuals, each associated with the
appropriate value of the function which in many cases is a real number, or a range of real
numbers. Thus the system consisting of Jupiter and its moons at a particular time, with
The factual content of a theory might now be explained by statements of the
form (2) were it not the case that the correctness of such statements can often
be ascertained only with the help of other statements of the same kind (pp. 64
f., 69).1 Statements concerning an a ∈ MZ, on the other hand, can be ascertained
independently of the theoretical framework of T. To relate them to T we proceed 
‘from the bottom up’ by selecting all those possible partial models that can be 
supplemented by further functions so that they become models of T: we are 
using statements of the form 


    (Vx)(xEa & x is a T)    (3)
where
xEa (x is an extension of a) iff
    a = <D, f>
relative positions specified within a certain limit of precision can be a partial possible 
model of Newtonian mechanics. 

Stegmueller calls partial possible models ‘observable facts’ (pp. 66 and passim) or 
‘physical systems’ (pp. 100, 147, 200, nn. 50, 187 etc.) and he calls ‘empirical’ all investigations
dealing with non-theoretical magnitudes (p. 69). It is important to see how these
notions of observability and empiricity differ from the epistemological notions that have
so far dominated investigations in the philosophy of science.

The epistemological notions arose in connection with problems of meaning and con- 
firmation. One postulated statements that were intrinsically meaningful and conclusively 
verifiable and tried to explain the meaning and the empirical support of other statements 
in their terms. Statements of the first kind were called observation statements, statements 
of the second kind theoretical statements. 

The distinction was retained by thinkers who did not accept intrinsically meaningful 
and conclusively verifiable statements but who still thought that some statements, being 
further removed from directly (though not conclusively) verifiable statements were more 
doubtful than others. The new version of the dichotomy that arose in this way contained 
two elements viz. (1) the (logical) fact that the examination of some statements of a theory 
involves other statements (of the same theory) while the examination of other statements 
does not and (2) the (psychological) fact that some statements are ‘packed with per- 
ception’ while others are not. The elements are independent. Perception packed state- 
ments of a theory may involve other statements of the same theory (example: psycho- 
analysis) while statements only loosely connected with perception may be tested without 
reference to the theory (example: the position of one component of a spectroscopic 
double star that is being examined as an instance of Newton’s theory of gravitation). 
Stegmueller resolutely separates the two elements, concentrates on the first and defines 
‘observable fact’ and ‘empirical’ accordingly (pp. 66, 69). In this he seems to be in
agreement with scientific practice for perception plays a negligible role in mathematical
physics (cf. my [1969] as well as my [1960]). It is advisable to show the break also in the 
terminology. I shall therefore use the term ‘factual’ instead of ‘observable’, or ‘empirical’. 

Instead of ‘observable facts’ Stegmueller also speaks of ‘physical systems’. This 
term is misleading for different reasons. It suggests that we are dealing with objects 
while in fact we are dealing with objects ‘that are described in a certain way, viz. using 
non-theoretical functions’ (p. 187). Again ‘fact’, or ‘T-fact’ seems to be a more appropriate 
term, for a fact is always objects in a certain situation. Of course, even now it ‘is easy to 
fall back into naive realism when dealing with such entities and to give in to an intuition 
that ... might be described as follows: “the elements of M_pp are the things that are
lying about in the world out there...”’ (p. 233). We must always keep in mind that T-
facts are relative to theories: asserting their existence makes sense only after a theory 
with clear and unambiguous methods of measurement has been specified. 

1 This problem does not arise in the statement view. Here the empirical content of T 
(with ‘empirical’ used in the traditional, ‘mythological’ (p. 56) sense) consists of all 
statements that are both observational, and consequences of T. The first condition 
(observability in the old sense) can be checked independently of the theory under
review, the second by logical analysis. 


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and 

    a ∈ M_pp
and there are F such that
    x = <D, f, F>
whose truth can be determined, first, by examining whether certain objects
belong to the D, then by fixing the values of the f by factual research and finally 
by investigating whether these values obey the conditions imposed by the F. 

Statement (3) disregards (a) that a theory is applied differently in different 
domains, (b) that different special laws are valid in these domains and (c) that 
the transition from one application to another does not change the values of the 
F.2

(a) is taken into account by splitting every domain into (possibly overlapping)
subdomains D^i with f^i and F^i as the corresponding functions, <D^i, f^i> the corresponding
partial possible models (called intended applications of T) and
<D^i, f^i, F^i> the corresponding models: (3) is said to hold for every <D^i, f^i>.
(c) is taken into account by demanding that the F^i obey constraints such as
F^i(q) = F^k(q) for any q ∈ D^i ∩ D^k. (b) is taken into account by restricting the
predicate T in each application in accordance with the laws asserted to be valid
in this application and demanding that the M_pp be models of the predicates so
restricted. Modified in accordance with (a), (b), and (c) statement (3) now 
reads (p. 100): ‘there exists a class of T-theoretical functions satisfying a class
of constraints such that all partial potential models belonging to the intended 
applications of the theory can be extended into models of T in such a manner 
that the elements of certain subsets of the set of intended applications can be 
extended into models of certain restrictions of the basic predicate’ and it ex- 
presses the factual claim of T. Note that this claim is a single indivisible state- 
ment and not a class of independent statements as in the statement view. The 
indivisibility is due to the constraints which ‘do not permit the factual content 
of a theory to fall apart into special hypotheses’ (p. 104). Note also that the factual
claim of T is different from T itself, and that it changes with the applications,
laws, constraints.
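
The identity constraint mentioned under (c) can be illustrated by a small sketch. It is not taken from Stegmueller's text: the function name `satisfies_identity_constraint`, the representation of each theoretical function F^i as a dictionary from individuals to values, and the numerical values themselves are assumptions made only for illustration.

```python
from itertools import combinations


def satisfies_identity_constraint(applications):
    """Check F^i(q) == F^k(q) for every q in the overlap of D^i and D^k.

    `applications` is a list of dicts; each dict maps the individuals of one
    subdomain D^i to the value of a theoretical function F^i (hypothetical
    representation, chosen for this sketch only).
    """
    for F_i, F_k in combinations(applications, 2):
        # Individuals shared by the two subdomains.
        for q in F_i.keys() & F_k.keys():
            if F_i[q] != F_k[q]:
                return False
    return True


# Invented values in arbitrary units: the 'mass' of Jupiter must be the same
# whether we consider its moons or its effect on Saturn.
jupiter_and_moons = {"Jupiter": 1.0, "Io": 0.00005}
jupiter_and_saturn = {"Jupiter": 1.0, "Saturn": 0.3}
conflicting = {"Jupiter": 1.2, "Saturn": 0.3}

consistent = satisfies_identity_constraint([jupiter_and_moons, jupiter_and_saturn])
inconsistent = satisfies_identity_constraint([jupiter_and_moons, conflicting])
```

The sketch only shows why the factual claim cannot be split into independent statements: whether one application is acceptable depends on the values chosen in every other application that shares an individual with it.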


3 BASIC CONCEPTS, ‘OBJECTIVE’ FORMULATION 


Theories can also be defined in an ‘objective’ way, independently of linguistic 
expressions. We start with a pair


    <S, I>    (4)


consisting of a mathematical structure S and a set of intended applications I.
The structure S consists of a core which contains the mathematical structure of 
the theory in the proper sense as defined by M, the class of its models, a function 
R related to the distinction between theoretical terms and non-theoretical terms, 
constraints C, and various expansions of the core which introduce applications 


1 Cf. the application of classical particle mechanics (a) to the moons of Jupiter and (b) to 
the problem of capillarity (Laplace). 

2 The mass of Jupiter remains the same whether we now calculate the effect on its moons, 
or upon Saturn. 

3 All these ideas are defined ‘objectively’, i.e. in set theoretical terms. Thus C(X) is a
constraint on the class X iff C ⊆ Pot(X) and Ax[x ∈ X -> {x} ∈ C]. X is a law for an
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and the laws and special constraints that are valid in them.t The core may be 
regarded as the ‘theory itself’ while the expansions show the theory at work 
under special conditions. Structure and intended applications are connected in
the factual proposition


    I ∈ A(E)    (5)


which asserts that the intended applications of the theory are facts admitted by
the theory.2 (5) uses the first element of (4), the structure, for making an assertion
‘about the world’ as presented in the second element. ‘[I]t contains everything 
one might usually find in a comprehensive treatise’ that deals with ‘the empirical 
content of a theory of mathematical physics at a particular time, including all 
the special laws assumed to be valid at that time’ (p. 138). 

Reduction of T = <S, I> to T’ = <S’, I’> involves (a) a correlation of I and I’
such that to every x ∈ I there corresponds at least one x’ ∈ I’ and (b) a correlation
of ‘explanations’ such that to every explanation of an x ∈ I in terms of T there
corresponds an explanation of x’ ∈ I’ in terms of T’ (p. 145). Explaining an
x ∈ M_pp in terms of T here means extending it so that it becomes a model of T
taking proper care of constraints and special constraints (p. 113). An x and an x’
that are correlated and whose explanations are correlated in the manner just
described are said to be the same fact, but described differently (p. 145).3


4 KUHN SNEEDIFIED 


This inventory of concepts allows us to draw a clear distinction between a theory, 
and assumptions about it: the theory is either the pair (4), or S, or the core of S 
while the various assumptions about it are expressed with the help of factual
propositions. Thus we may assume, in an entirely ‘unforced’ way (p. 191) that 

the theory remains constant while there is a series of expansions E, E’, E”,
... such that A(E) ⊆ A(E’) ⊆ A(E”) etc. with I ∈ A(E^i) for every E^i in the
series,

S = <M_p, M_pp, R, M> iff X ⊆ M_p. The function R turns possible models into partial
possible models, or classes of possible models into classes of partial possible models, or 
classes of classes of possible models into classes of classes of partial possible models, 
depending on context. Finally, X is a core for a theory iff there are M_p, M_pp, R, M, C
such that X = <M_p, M_pp, R, M, C> and a domain D and functions f from D into R
such that x ∈ M_p iff x = <D, f> and M ⊆ M_p with C a constraint on M_p, and R as
just explained. 
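
The set-theoretical definition of a constraint given in this note can be checked mechanically on a finite class. The following is a minimal sketch under stated assumptions: the names `powerset` and `is_constraint` are introduced here for illustration, the class X is taken to be a small finite set, and C is represented as a set of frozensets.

```python
from itertools import chain, combinations


def powerset(xs):
    """Return Pot(xs) as a set of frozensets (xs must be finite)."""
    xs = list(xs)
    subsets = chain.from_iterable(combinations(xs, r) for r in range(len(xs) + 1))
    return {frozenset(s) for s in subsets}


def is_constraint(C, X):
    """C is a constraint on X iff C is a subset of Pot(X) and {x} is in C for every x in X."""
    return C <= powerset(X) and all(frozenset({x}) in C for x in X)


X = {"a", "b", "c"}
singletons = {frozenset({x}) for x in X}

# Admits every element on its own, and additionally the combination {a, b}.
C_ok = singletons | {frozenset({"a", "b"})}
# Fails the second condition: the singleton {c} is missing.
C_missing = C_ok - {frozenset({"c"})}
# Fails the first condition: contains a set that is not a subset of X.
C_outside = C_ok | {frozenset({"a", "z"})}

results = (is_constraint(C_ok, X), is_constraint(C_missing, X), is_constraint(C_outside, X))
```

The second condition says that a constraint never excludes a single element taken by itself; it only rules out certain combinations of elements.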

1 X is an expanded core iff X = <M_p, M_pp, R, M, C, L, C_L, p> where <M_p, ..., C> is
a core, L contains all the laws for <M_p, M_pp, R, M>, C_L is a constraint on the theoretical
functions of the laws that restrict the theory in its various applications and p a relation
that correlates the special laws of L to the intended applications in which they are supposed
to work.

An expansion of a core X is an expanded core that shares with X the first quintuple 
of entities. 

2 The factual proposition is the proposition corresponding to the factual claim of a theory.
Facts ‘admitted by a theory’ are those facts which an addition of theoretical functions
satisfying C, C_L extends into possible models that are models of the basic mathematical
structure and satisfy the laws L (p. 148). The function A is defined in such a manner
that it produces the facts admitted by the theory whose expansion forms its argument.
Thus A(core) is R[Pot(M) ∩ C].
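
The expression for A(core) can be worked through on a toy example. This is only a sketch under loudly stated assumptions: possible models are reduced to triples (system, f, F), the 'law' F = 2f, the constraint 'all members of a set share the same F', and all numerical values are invented for illustration and do not come from Sneed or Stegmueller.

```python
from itertools import chain, combinations


def powerset(xs):
    """Return Pot(xs) as a set of frozensets (xs must be finite)."""
    xs = list(xs)
    subsets = chain.from_iterable(combinations(xs, r) for r in range(len(xs) + 1))
    return {frozenset(s) for s in subsets}


# Possible models M_p: (system name, non-theoretical value f, theoretical value F).
M_p = {("a", 1, 2), ("a", 1, 3), ("b", 2, 2), ("b", 2, 4)}

# Models M: possible models satisfying the invented law F == 2 * f.
M = {x for x in M_p if x[2] == 2 * x[1]}

# Constraint C on M_p: sets of possible models whose members agree on F.
# Every singleton qualifies, as the definition of a constraint requires.
C = {Y for Y in powerset(M_p) if len({x[2] for x in Y}) <= 1}


def R(Y):
    """Turn a class of possible models into a class of partial possible models by dropping F."""
    return frozenset((system, f) for (system, f, _F) in Y)


# A(core) = R[Pot(M) ∩ C]: classes of facts that can be extended into models
# jointly, without violating the constraint.
A_core = {R(Y) for Y in powerset(M) & C}

I_single = frozenset({("a", 1)})
I_joint = frozenset({("a", 1), ("b", 2)})
admitted_single = I_single in A_core
admitted_joint = I_joint in A_core
```

In this toy case each fact is admitted separately, but the two together are not, because extending both into models would require different values of F and so violate the constraint. This is the sense in which the factual proposition I ∈ A(E) is a claim about the whole class I.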

3 This account of reduction is a generalisation of some suggestions first made by E. P.
Adams.
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Nor can the apparent immunity of theories (paradigms) during periods of 
normal science now be regarded as an objection. One only needs to realise that 
theories as defined in (1) and (4) are not the kinds of entities that can be refuted: 
(1) is a predicate, (4) is a pair, consisting of a structure and a class. Yet, though 
the theory itself cannot get in conflict with facts,1 it can be used to make statements,
such as its factual claim, that do get in conflict with facts.2 There exists a
relation between theories and facts that may influence the fate of the theories,
but this relation is more complex than is assumed in inductivism, falsificationism
etc.

For example, given I ∉ A(E^i), we may retain <S, I> and drop E^i. The
procedure is ‘strictly rational’ (p. 214) as the faulty assumptions are contained
in the expansion, not in the core.3 It shows how Popper’s falsificationism can be
‘reconciled’ with Kuhn’s insistence that anomalies do not touch paradigms. 

Defining I as a ‘Wittgensteinian predicate’4 provides a further possibility of
preserving <S, I> in the face of a failure of factual claims. We may now remove
a fact ∈ I − I_0 (I_0 the set of pragmatic cases that determine I) that is not admitted
by some expansion of the theory without violating rationality. It would be
irrational to remove elements of an I that is determined by a property without
changing I itself. It would be equally irrational to justify such a procedure by
reference to ‘what is actually done by scientists’ thus introducing a conflict 
between ‘reason’ and ‘irrational history’ (p. 199). But it is rational to let history 
suggest a reconstruction whose procedures are both realistic, i.e. in agreement
with what is done by scientists, and rational, i.e. in agreement with the internal
rules of the reconstruction. Stegmueller claims that the use of Wittgensteinian
I’s is realistic (scientists do not subscribe to an ‘exaggerated rationalism’, p. 199)
and that the second method of preserving <S, I> is rational in such a reconstruction.

Instead of removing I’s that are excluded even by weak expansions we may
add I’s that are admitted by strong expansions and lend factual support to the
corresponding claims. In both cases we have a set of paradigmatic facts I_0 and
the theory determines the elements of I − I_0 (p. 225). Occasionally this self-determination
leads to facts that have no prior similarity with any of the I_0, and
that is quite legitimate, for the similarity may lie in some theoretical properties. 


1 Problems of confirmation and test in the usual sense ‘disappear’ in the case of theories 
(p. 247). 

2 But they ‘fully apply’ to the factual proposition of a theory (p. 247). 

3 It is assumed that I ∈ A(core).

4 A ‘Wittgensteinian predicate’ is a predicate whose extension is determined neither by a
list, nor by a property, but by a set of paradigmatic examples plus similarities between 
these examples and a new case that are not all stated in advance, but may be ‘discovered’, 
or decided upon the moment the new case comes up for consideration. 

Stegmueller assumes (with some reservations: p. 227) that the list of paradigmatic 
cases is (1) absolutely fixed and that (2) the inexactness of the remainder concerns the 
sufficient, but not the necessary conditions of membership (p. 196 f.). Thus in the case 
of games football, cricket, poker would all be paradigmatic examples while ‘being a human
activity’ would be a necessary condition that is never violated. This restriction corresponds
neither to Wittgenstein’s intentions, nor to the nature of the concepts. Poker
would cease to be a game if it became a mania, like St Vitus’s dance, and games have also
been found among animals. Altogether it seems that Wittgensteinian predicates cannot
even be partly formalised. Every new application involves a linguistic decision and therefore
amounts to a partial redefinition.
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Thus there is little in common, prima facie, between the motion of the moon, 
the motion of a Brownian particle and the properties of light and yet there are 
expansions of the core of Newtonian mechanics that produce facts in all these 
domains and whose factual propositions are confirmed by further research 
(p. 228 ff.). Autodetermination does not mean autoverification, however, for a
‘natural extension’ of the facts subsumed (addition of the laws of interference
to the laws of reflection, refraction, rectilinear propagation in the case of light)
or a ‘splitting up’ of those facts (finding of more precise methods for determining
the path of the moon) may again necessitate their removal from I. All these
procedures leave the theory untouched and thus provide ‘a clearer and more
differentiated idea of what the . . . immunity of a theory in the face of “recalcitrant
experiences” really amounts to’ (p. 230).

The stage is now set for giving an account of the stable and the variable
elements of a period of normal science. The stable elements are partly logical,
partly pragmatic. The logical elements are the structure of the theory used as
represented by its core (pp. 203, 220) and the class of paradigmatic facts I_0.
The pragmatic elements consist in the choice of a core and an I_0, in the belief
that I_0 ∈ A(core) (this may include an account of the ‘historical origin’ of the
theory, p. 223), the belief that for at least some expansion and some I ≠ I_0,
I ∈ A(E) and that there is evidence for this proposition as well as the belief that
progress will occur, i.e. there will be Jẹ e T such that {J+} e A(Z) and E* such 
that A(Z) = A(E’) etc. with I ∈ A(E^i) for all i concerned (p. 214). The variable
elements are the new I’s, the new E’s, the new evidence provided by the examina- 
tion of the new claims (p. 231). What decides the success of a paradigm is the 
discovery of E’s that start a period of unimpeded growth or at least a period
where faulty claims can be eliminated by a change of expansions so that there is 
overall growth despite temporary setbacks. As the number of possible expansions 
is at least Aleph-one ‘any assertion concerning the failure of a theory is finally 
based upon the decision to stop a further search’ (p. 226). This is how the im- 
munity of a theory in Kuhn’s account can be rationally explained. There is only 
one ‘absolute limit’ and it lies in the paradigmatic cases: ‘if a theory fails in these 
cases, then it must be given up or else we face a complete collapse of what could 
be meant by its “domain of application” ’ (p. 222),1 but even here we may
‘question the sharp delimitation of the class of paradigms’ (p. 227).2 Considering
the many escape routes that are at our disposal we see that the possibilities of
eliminating a theory during a period of normal science are severely restricted.
Even in extreme cases elimination depends on rules whose rationality can always
be questioned. The situation is best described by the slogan: in dubio pro theoria
(p. 230). 

Theory change is accounted for by distinguishing between (1) the first arrival 
of a physical theory in a certain domain and (2) the replacement of a fully 
fledged physical theory by another. The first process has two stages.3 Stage one


1 Stegmueller does not deny that ad hoc hypotheses may be used to save even such cases, 
but their discussion, he says, belongs to the ethics of science, not to its logic (p. 226). 
He also regards as an ‘exaggeration’ Kuhn’s belief that a paradigm can be given up only 
when an alternative is available. 

2 Stegmueller quotes the tides as a case that may, and may not, be regarded as an I_0 for
Newtonian mechanics (p. 227).

3 What follows is not meant to be an historical assertion, but a systematic account: i, p. 468. 
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introduces terms which are not tied to the observational level by fixed conditions 
and fixed results but allow for the interpolation and the exchange of auxiliary 
hypotheses (i, pp. 233 ff.—modified). Terms of this kind as it were float above 
the observational level and are connected to it now by the one, now by the other 
factual assumption. They are called theoretical terms in the weak sense (i, 
p. 239).1 This stage is captured by the two-language model. In stage two theoretical
terms have become part of a theoretical structure such that each application
involves other applications of the same structure. We are now forced ‘to
think of theories as complicated mathematical structures instead of regarding 
them as classes of statements’ (p. 240). Pre-theoretical ideas which have the 
psychological advantage of being ‘historically tied to the specific terrestrial- 
human experience’ (p. 236) still survive in stage two in the form of constraints. 
They offer greater resistance to change than do theoretical laws (p. 243). 

Process (2) has a practical side which has been unjustifiably combined with 
theoretical assumptions implying irrationality. The practical side is that a theory, 
being not a statement but a complex instrument for the production of statements,
serves a purpose even if it serves it badly. It is as reasonable to retain it as it is to retain a leaky
roof until a better roof has been found. Of course, we cannot prove that man 
should behave in this way, but this is no disadvantage as we are not dealing with 
a case for which a proof is required. Practical considerations suffice and that is 
all that can be said (p. 247). 

It does not follow, however, that considerations of test have now been elimin- 
ated by appeal to history, or practical usefulness, or that the distinction between 
a context of discovery and a context of justification has lost its point. Both the 
considerations and the distinction remain in full force where they are applicable, 
i.e. with statements such as the various factual claims of the theories. Kuhn, says 
Stegmueller, makes a double mistake. He first interprets theories as empirical 
statements and he then asserts that history forces us to treat these statements 
as if they were not empirical statements after all, e.g., as if they were not subjected
to tests. Distinguishing between the statement and the non-statement
elements of sciences removes the error and restores rationality (p. 247). 

Furthermore in arguing for incommensurability, Kuhn commits a ‘logical 
mistake’ (p. 248) and his argument is hardly more than a ‘joke’ (p. 249): in trying 
to defend a thesis of his own he uses the statement view which belongs to his 
opponent. Instead of inferring incommensurability from non-derivability one
has to realise that reduction (of one theory to another) cannot be defined in 
terms of derivability but must be viewed in the manner explained towards the 
end of the preceding section.2 This new idea of reduction removes a sizeable
‘gap of rationality’ (p. 250) and completes the demonstration that Kuhn’s ideas 
contain, apart from mistakes, exaggerations and non-logical components, a 
comprehensive new conception of science that can be presented in a rational 
manner if only the proper logical instruments are used. Even the altercation of 
paradigms which during a revolution replaces the empirical rivalry of the 


1 The use of theoretical terms in the weak sense involves idealisations and a rich net of 
empirical generalisations.

2 Stegmueller also accuses Kuhn of a ‘transgression of competence’: it is not up to the
historian to assert, or to deny that one theory is reducible to another theory; that is the
task of the logician. The task is difficult and cannot be solved with the help of a few
informal remarks (p. 249). 
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different expansions of a single theory by an ‘a priori battle’ is finally decided 
in a perfectly rational way, in favour of the theory that does everything its pre- 
decessor did and adds some contributions of its own (p. 251). 


5 ADVANTAGES AND DISADVANTAGES OF THE STATEMENT VIEW


One of the advantages of the Sneed—Stegmueller reconstruction is that it puts 
into relief certain features of science that almost disappear in the statement view. 
One minor example which is not mentioned by Stegmueller is the role of diagrams
and models: chemical formulae are compared and combined according
to strict rules but it would be somewhat artificial to regard them as statements.
Of course, they can be used to produce statements, but they are not statements 
themselves and transformations leading from one formula to another do not 
go through a statement phase. An even more important example is the role of
a priori elements in our knowledge. Categories, forms of perceptions, are structures
which again give rise to statements (Kant’s synthetic a priori statements)
without being statements themselves. Stegmueller mentions Kant and expresses
the expectation that his account will come closer to Kant’s aim ‘of reconciling
the a priori component and the empirical component of the scientific process’ 
(p. 252). On the other hand there are properties of scientific change that are 
much better explained by the statement view (I shall soon mention such pro- 
perties). So, let us examine the arguments which Stegmueller offers for his mode 
of reconstruction. 

There are essentially two such arguments: (i) the statement view makes Kuhn 
appear irrational while Stegmueller’s procedure does not; (ii) the statement
view has problems which do not occur in its alternative (role of theoretical 
terms; of a priori elements; incommensurability; etc.). (i) can be defused by 
showing (A) that the features rationalized by Stegmueller occur in Kuhn but 
not in science and/or (B) that these features can also be explained by the state- 
ment view. (ii) can be defused by showing how the problems can be solved within 
the statement view. I start with the discussion of (i)B. 

For Stegmueller one of the most telling arguments in favour of his reconstruc- 
tion is that it enables us to give a rational account of the apparent immunity of 
paradigms in the face of recalcitrant evidence: theories as defined in (4) cannot
get into conflict with experimental results. 

But such theories are not entirely separated from the facts. Given a set of 
facts, a claim that is contradicted by the facts, and certain methodological de- 
cisions, it may be very difficult to retain the theory that produces the claim (see 
the preceding section especially text to footnote 1, p. 357). The theory may have 
to be eliminated not because it is refuted but because there exists a situation 
involving facts which, given the methodological decisions, demands its removal.1
Theories as reconstructed by Stegmueller are therefore not immune to removal, 
they are not even immune to removal by facts, they are only immune to certain 
types of removal, such as refutation. The difference between the statement view 
and the Sneed—Stegmueller method of reconstruction is not that theories are 
eliminated in the former while they stay around forever in the latter. The 


1 Cf. text to footnote 1, p. 357. 
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difference rather lies in the circumstances that bring about the demise of a theory. 
Facts play a role in both accounts, but they threaten theories in different ways. 
Also the immunity that is granted a theory by Stegmueller is greater than the 
immunity granted it by some of the more simple minded methods of the statement
view (naive falsificationism, for example). But there are other statement-view
methodologies, such as the methodology of research programmes, where
the removal of a theory is just as difficult as in Stegmueller. The argument from
immunity, taken by itself, therefore does not put Stegmueller ahead of the statement
view.1

The same is true of the other features discussed in section 4. Theory and 
belief can be separated by showing how the theory remains constant while its
probability changes in different conditions and developments in normal science 
can be explained accordingly. 

Nor does the arrival of theoretical terms in the strong sense ‘force us’ (p. 240) 
to abandon the statement view. It is true that the connections between such 
terms give rise to rather complex structures but there is no reason why the con- 
cept of a theory should be tied to these structures rather than to the statements 
expressing the connections. Quite the contrary, it seems that the statement view 
has here some real advantages. In Stegmueller we have two kinds of change, 
viz. (normal) changes with constant core (and varying intended applications 
and expansions) and (revolutionary) changes from one core to another and 
therefore from one paradigm to another. But in actual scientific practice there is 
a further kind of change to be considered viz. a modification of the core that
leaves the corresponding paradigm unchanged. To explain it let us assume that 
a paradigm is constituted by a cluster of statements (and methods) that is deter- 
mined by a Wittgensteinian predicate. In this case there can occur a rather 
drastic exchange of statements without change of paradigm (cf. p. 356, n. 4, 


1 To see the immunity-propensities of the statement view let us consider the trite ‘law’ 
    All ravens are black    (S)


and let us assume that test statements involving colours are statements asserting that 
some object possesses a certain colour. 

Now S is not refuted by a raven that has been painted white or by a raven that lost its 
colour because it passed through noxious fumes: ‘black’ in (S) means ‘intrinsically
black’, i.e. ‘black due to internal circumstances’. Nor do all internal circumstances
contribute equally to intrinsic blackness. (S) is not refuted by a raven that has become
pale because of a rare disease; and it is not refuted if it should turn out that the blackness 
of all ravens is due to a contagious disease that has lasted for centuries and is transferred 
from raven to raven by birth (though the last case exposes some ‘intrinsic’ vagueness in 
the content of (S); one might decide to say that ravens, after all, are white, but that their 
whiteness had never a chance of coming to the fore.) We have to conclude that test 
statements as defined above can never conflict with (S): (S) is immune to refutation ‘by 
observation’ as the observed colour of a raven does not decide the question as to whether 
this colour is intrinsic or not. 

On the other hand, (S) used together with statements describing conditions (weather, 
genetic make-up, etc.) in various domains D, D’, D” etc. entails statements (S’), (S’’),
(S’’’) that are refutable by observation. If D ⊂ D’ ⊂ D” etc. then content (S’) ⊆
content (S’’) ⊆ content (S’’’) etc., i.e. the S^i will be increasingly falsifiable. It is easily
seen how definitions such as these can lead to an account of the progressive elements of 
normal science that parallels the account of the Sneed—Stegmueller view. Immunity 
increases when we choose more ‘interesting’ laws such as Ohm’s law, or the law of
inertia, or the law of universal gravitation. All these laws are immune to refutation unless 
combined with elements which make them produce statements which are refutable. 




Changing Patterns of Reconstruction 361 
second paragraph). Let us also assume that the entire cluster is applied differently 
in different domains, that there are (clusters of) special laws and special constraints
which vary from one domain to the next. Then the core of reconstruction (4)
which depends on the statements in the cluster (cf. the definitions following (1)
as well as the definitions in p. 354, n. 3) may change without a change of paradigm.
Now if there are episodes in science that conform to the pattern just
described then we have a difficulty for the Sneed-Stegmueller view and an 
argument (in the sense explained in the paragraph following n. 4, p. 356) for 
the statement view. Of course, the argument would not show that the statement 
view is universally valid, it would only show that it is preferable on some occasions
(though perhaps not on others).

Now I think that there are such episodes. The early quantum theory can be 
regarded as a paradigm (and was regarded as a research programme by Lakatos) 
but there is no fixed underlying structure, no core in the sense of Sneed and 
Stegmueller, very fundamental assumptions such as the law of conservation of 
energy and momentum may be dropped and picked up again, and this is not 
just in special domains, as would be the case with special laws and expansions, 
but wherever the paradigm is applied. Moreover, this feature can be found not 
only in this particular period of the history of science which was rather unruly 
but in more settled periods as well. The statement view as exemplified in 
Lakatos has ready machinery for dealing with it: Lakatos’s core which is a 
loose cluster of statements is not as rigidly defined as the core of (4)1 and permits
precisely the kind of statement exchange I have in mind. Then there is the 
protective belt of special laws and conditions whose modifications create the 
phenomena Stegmueller wants to reproduce: relative immunity of the paradigm, 
‘normal’ progress and failures, advances, retrenchments, tentative applications 
in different areas. Result: Stegmueller’s (i) can be defused by showing (i)B 
that the features of Kuhn that do correspond to actual science are also captured 
by the statement view, and without major contortions, while there are some other 
features where the statement view has definitely the upper hand. Adding (i)A 
that normal science as described by Kuhn is practically non-existent,² that 
many changes which Kuhn classifies as normal and Stegmueller reconstructs 
with the help of a stable core and varying expansions leading to progressive and 
regressive developments, to increasingly sharp claims and retrenchments are 
changes of the new kind just described strengthens the case for the statement 
view. Turning to (ii) we find further arguments for it. The application of a 
theory only rarely involves the entire structure, quite the contrary: the aim is 
frequently to stay clear of certain parts of the theory as long as possible. Of course, 
I don’t deny that the Sneed—Stegmueller account fares well on other occasions 
but such an admission, far from giving comfort to Stegmueller’s programme, 
undercuts it in a fundamental way: it undercuts the whole idea of reconstruction 
by suggesting that we may have to use different reconstructions on different 
occasions and that the assumption that a single scheme (statement view; 


1 Deciding for a paradigm in the sense of (4), says Stegmueller, is an ‘all or nothing 
decision’ (p. 273). 

2 Cf. my [1970b]. Stegmueller defends normal science (p. 303), but only by referring to 
Kuhn and his own reconstruction. 
structural view; Hegelianism) will cover all of science, all of physics, or even 
all of quantum mechanics may be nothing but a pious dream. 

Stegmueller compares the work of the reconstructionist with the work of the 
scientist (p. 7). The scientist wants to give an account of the world and in doing 
so creates knowledge. The philosopher of science/epistemologist wants to give 
an account of this knowledge and in doing so creates a ‘second (layer of) 
rationalization’ (p. 7) including formal logic, inductive logic, reconstructions. 
Aristotle’s work, says Stegmueller (p. 305), is an excellent example of research of 
the second kind. 

Now contrary to what Stegmueller seems to feel I have no objection to 
anyone doing work of this kind, but I have a series of criticisms not of the 
enterprise itself, but of various features of it that have become rather prominent 
in the past few decades. 

First of all I regard the comparison with Aristotle as very misleading. It is 
true that almost all of Aristotle’s general work and many parts of his special 
treatises are second rationalisation (of common knowledge: cf. W. Wieland’s 
excellent book Die Aristotelische Physik). But this work proved fruitful for 
science, that is for the various activities and results that constitute the first 
rationalisation. It not only revealed a pre-existing structure of science, it also 
made suggestions for structural reform and added to our knowledge of the laws 
of nature (Aristotle’s theory of motion contains conceptual clarification as well 
as physical laws). 

Aristotle tried to understand. But in his attempt to understand he advanced 
science (the same was still true of Mach). By contrast the understanding aimed 
at in the philosophy of science of today is barren, it leads neither to new laws, 
nor to suggestions for scientific reform. Moreover this barrenness is not an 
accident, it is not merely the result of a considerable shrinkage of talent, it is 
proudly defended as a sign that the philosophy of science has at last become an 
independent subject and can claim a professional ethics of its own (some asides 
by Stegmueller indicate that he is not at all averse to such a defence). I agree 
that not all research in an area need have consequences outside of it but I start 
getting rather suspicious when seeing that the absence of outside fruitfulness is 
not regarded as a problem and is even welcomed as a sign of maturity. So, by 
all means, let us have understanding, but an understanding that bears fruit and 
is not content with a posture of loquacious gaping. 

My second criticism is that the second rationalisation is still in a stage which 
the first rationalisation has left behind long ago. Knowledge, in the first rational- 
isation, once meant absolute, provable, universal knowledge. Today we know 
that there is no hope of proving theories in the empirical sciences, we are content 
with hypotheses instead of ‘eternal laws’ and we have realised that even the most 
‘basic’ laws may be subjected to spatio-temporal change. Compared with this 
the attitude of most ‘metascientists’ is much more rigid. True, there are splendid 
exceptions such as Carnap (whose ‘principle of tolerance’ is pretty close to my 
own anarchism), but even he demands that clear formulations be used at all 
times and so neglects examining, or proposing rules for the long stretches of 
research that consist in the gradual development of new concepts. While in 
science concepts are given time to develop, metascience demands that they spring 
into the world like Pallas Athene from the forehead of Zeus. 
Third, the analogy between the first and the second rationalisation as drawn 
by Stegmueller and his objections to mixing subjects (pp. 246, 249, 310) argue 
against the logical approach in metascience: if the second rationalisation 
rationalises knowledge like the first rationalisation rationalises the world then 
the proper method to be used in it is the historical method, or the anthropological 
method, for knowledge is a historical/anthropological/even cosmological (p. 232) 
phenomenon. 


6 INCOMMENSURABILITY AND THE RATIONALITY OF 
THEORY CHANGE 


Stegmueller’s presentation of problems, his exposition of views different from 
his own, the solutions he suggests, the explanations that accompany the solutions, 
his asides are always clear, simple, and they show an admirable grasp both of 
traditional and of modern forms of thought. One must admire the paternal 
patience with which he discusses extreme positions such as mine and tries to 
make sense of them. He has many valuable things to say about the nature of 
knowledge, methodologies, general procedures and some of his warnings are 
going to be taken to heart even by this ‘fun loving’ (p. 305) reviewer. It is true 
that in his effort to look at things from as many sides as possible he occasionally 
becomes repetitious: there are informal informal explanations followed by in- 
formal formal explanations followed by formal formal explanations and I often 
did not see the point of a formal definition with all the x’s and y’s in the proper 
place when an informal statement could have done just as well (hardly any of 
the definitions are used as a starting point for the derivation of novel theorems 
and thus of fruitful knowledge; the most we get are lemmas for further defini- 
tions)—but these are minor faults. The reader can always skip the repetitions 
and formulae without losing track of the main argument. There is only one 
place where Stegmueller’s virtues seem to be almost entirely absent and this is 
in his discussion of incommensurability. Apparently everyone who enters the 
morass of this problem comes up with mud on his head, and Stegmueller is no 
exception. He gives a misleading account of the phenomenon, he lumps together 
what different authors have said on the matter, he misrepresents them, and he 
suggests a solution that is hardly satisfactory, both from a logical and from an 
historical point of view. So let me once more state what is involved in the problem 
and what follows from it. I shall give two accounts, Kuhn’s account and my own, 
for these have been developed independently, and they are also fairly different. 

Kuhn has observed that different paradigms (A) use concepts that cannot be 
brought into the usual logical relations of inclusion, exclusion, overlap; and (B) 
make us see things differently (research workers in different paradigms have not 
only different concepts, but also different perceptions); and (C) contain different 
methods for setting up research and evaluating its results. According to Kuhn 
it is the collaboration of all these elements that makes a paradigm immune to 


1 This part was argued with vigour and many examples by the late N. R. Hanson ([1958]). 

2 One of the motives that prompted Kuhn to reconsider the standard paradigm of scientific 
change was his discovery that Aristotle’s physics, while different in many respects from 
modern physical theory still gave an orderly, and therefore ‘rational’ account of a great 
variety of phenomena. 


difficulties and incomparable with other paradigms. Incommensurability in the 
sense of Kuhn (inc_K for short) is the incomparability of paradigms that results 
from the collaboration of (A), (B) and (C). Stegmueller discusses only area (A),¹ 
shows that there can be comparability despite conceptual disparity and seems 
to assume that incommensurability in Kuhn’s sense has now been done away 
with. But of course it has not, for inc_K is not merely conceptual disparity. His 
discussion of Kuhn is therefore defective at a decisive point.² 

As opposed to Kuhn my own research started from certain problems in area 
(A) and my discussion of these problems was restricted to a fairly narrow 
domain.’ Both in my thesis (1951) and in my first English paper on the mattert 
I asked how observation statements were to be interpreted. I rejected two ac- 
counts viz. the pragmatic theory according to which the meaning of an observa- 
tion statement is determined by its use, and the phenomenological theory accord- 
ing to which it is determined by the phenomenon that makes us assert it as true. 
I replaced them by the principle that the interpretation of an observation lan- 
guage comes from the theory that explains what we observe, and changes as soon 
as this theory changes.⁵ I realised that the principle might make it impossible 
to establish deductive relations between rival theories and I tried to find means 
of comparison that were independent of such relations.⁶ In the years following 


1 He admits there are many aspects to a paradigm (p. 206) but he restricts his discussion 
to area (A) and some pragmatic notions. Differences of methodology and perception 
are never mentioned which is surprising for he realises that the use of theories depends 
on a variety of questionable methodological rules and decisions (pp. 226, 230). 

2 According to Stegmueller the exchange of theories during revolutions is guided by the 
principle that the new theory must at least achieve as much as its predecessor where 
achievement is measured by number of facts explained (p. 251). This assumes that the 
rival paradigms both regard this as a necessary condition of excellence, but such is not 
always the case. What counts in Aristotle is conformity with laws that have been ab- 
stracted from ‘normal’ observation (and speech); cf. ch. 12 of the German edition of my 
Against Method (Wider den Methodenzwang, Frankfurt 1976). What counts for Einstein 
is harmony of the whole which he more than once puts above ‘verification by little effects’ 
(cf. Against Method, ch. 5, n. 9). 

Lakatos too thinks he can rationalise science by discussing only problems of area (A) 
and assuming stable standards of evaluation throughout. He takes it for granted that 
Copernicans and anti-Copernicans used the same methodology (the methodology of 
research programmes, naturally) to evaluate their respective achievement. Thus he never 
reached Kuhn’s very different theory by his criticism, nor does he pay attention to the 
features of history Kuhn tries to capture. Cf. Against Method, pp. 204 ff. and 240. 

In the simplest form the errors can be found in Popper’s remark that paradigms can 
always be compared. They certainly can—but according to whose standards? And how 
can the standards themselves be compared? Popper gives no answer. 

3 Originally, under the influence of Wittgenstein, I conceived paradigms (‘language 
games’, ‘forms of life’ were the terms I used then) as comprising (A), (B) as well as (C): 
different language games with different rules would give rise to different concepts, 
different ways of concept construction and statement evaluation, different perceptions 
and would therefore be incomparable. I explained such ideas in Anscombe’s home in 
Oxford in the fall of 1952 with Hart and von Wright present. Later on I found it necessary 
to restrict research to be able to make more specific assertions. Kuhn’s book and especially 
Lakatos’s reactions to it then encouraged me to resume the more general approach. 
The results are found in chapters 16 and 17 of Against Method. 

4 Feyerabend [1958], pp. 143 ff. 

5 Ibid., p. 163. 

6 Thus in my 1958 paper I tried to give an interpretation of crucial experiments that was 
independent of shared meanings. I improved this account in my [19700], p. 226. 
my [1958] (which preceded Kuhn’s Structure and appeared simultaneously 
with Hanson’s Patterns) I tried to specify the conditions under which two 
theories ‘in the same domain’ would be deductively disjoint¹ and I tried to find 
methods of comparison that survive despite the absence of deductive relations. 
When using the term ‘incommensurable’ I always meant deductive disjointed- 
ness, and nothing else. Thus while inc_K is the incomparability of paradigms that 
results from the collaboration of (A), (B) and (C), inc_F is deductive disjointed- 
ness and I never inferred incomparability from it as Stegmueller suggests 
(p. 167). Quite the contrary, I tried to find means of comparing such theories. 
Comparison by content, or verisimilitude was of course out. But there certainly 
remained other methods.² 

Now the interesting thing about these ‘other methods’ is that most of them, 
though reasonable in the sense that they agree with the wishes of a sizeable 
number of researchers, are arbitrary, or ‘subjective’, in the sense that it is very 
difficult to find arguments for the acceptability of these wishes³ that do not 
rest on wishes of a similar kind. Also these ‘other methods’ most of the time give 
conflicting results: a theory may seem preferable because it makes numerous 


1 The conditions deal only with theories and their logical relations and thus belong to 
area (A) of the paradigm differences noted by Kuhn. [I believed for some time that 
conceptual differences would always be accompanied by perceptual differences, but 
I gave up this idea in my [19652], text to footnotes 50 ff. (reason: the idea does not agree 
with results of psychological research). In Against Method, p. 238 f. I warned against 
‘an inference from style (or language) to cosmology and mode of perception’ and specified 
conditions in which such an inference can be made.] To circumvent the difficulty that 
arises when we want to say that incommensurable theories ‘speak about the same things’ 
I restricted the discussion to non-instantial theories ([1962], p. 28) and I emphasised 
that mere difference of concepts does not suffice to make theories incommensurable in 
the sense of being logically disjoint. The situation must be rigged in such a way that 
conditions of concept-formation in one theory forbid the formation of the basic con- 
cepts of the other (cf. the explanation in Against Method, p. 269 and the reason, given 
there, why such explanations have to remain vague; cf. also the comparison of theory 
changes that lead to incommensurability in my sense with changes that do not in my 
[19658], section a). Of course, theories may be interpreted in different ways, they may be 
incommensurable in some interpretations, not incommensurable in others. Still, there 
are pairs of theories which in their customary interpretation turn out to be incommensur- 
able in the sense at issue here. Examples are classical physics and the quantum theory; 
general relativity and classical mechanics; Homeric Aggregate physics and the substance- 
physics of the Presocratics and their followers. 

2 There are formal criteria: a linear theory is preferable to a non-linear one because solu- 
tions can be obtained more easily. This was one of the main arguments against the non- 
linear electrodynamics of Mie, Born and Infeld. The argument was also used against 
the general theory of relativity until the development of high speed computers simplified 
numerical calculations. Or: a ‘coherent’ theory is preferable to a non-coherent one (this 
was one of Einstein’s criteria for preferring general relativity to other accounts). A theory 
using many and daring approximations to reach ‘its facts’ may be less likeable than a 
theory that uses only few and safe approximations. Number of facts predicted may be 
another criterion. Nonformal criteria usually demand conformity with basic theory 
(relativistic invariance; agreement with basic quantum laws) or with metaphysical 
principles (such as Einstein’s ‘principle of reality’). 

3 Take simplicity, or coherence: why should a coherent theory be preferable to a non- 
coherent one? It is more difficult to handle, derivations of predictions are usually more 
elaborate and if the devil is master of this earth and a foe of scientists (why he should be 
I cannot imagine, but let us assume he is) then he will try to confound them so that 
simplicity and coherence will no longer be reliable guides. 
predictions, but the predictions may be based on rather daring approximations. 
Or a theory may seem attractive because of its coherence but this ‘inner harmony’ 
may make it impossible to produce many numerical results. Transition to 
criteria not involving content thus turns theory choice from a ‘rational’ and 
‘objective’ and rather one dimensional routine into a complex decision involving 
conflicting preferences, and propaganda will play a major role in it, as it does in 
all cases involving preferences.¹ Adding areas (B) and (C) strengthens the 
subjective, or ‘personal’ (or sociological) component of theory change. 

Stegmueller does not consider such methods for he thinks he has found a 
different and much more efficient way of bridging the logical gap between 
different theories. According to Stegmueller there are such gaps between any 
two theories, for different cores can never enter into deductive relations. Obviously 
he has to reject the account given of the gap in the statement view and the 
reasons given for the account: ‘paradoxically Kuhn and Feyerabend fall back 
upon the view of their opposition, viz. the statement view when trying to 
establish that a theory that is pushed aside and the theory that does the pushing 
aside are incomparable’ (p. 24). 

The quote assumes that both Kuhn and I have some argument to show that 
rival theories cannot be deductively related and that we use these arguments to 
infer, or perhaps already to assert incomparability. This is the first error com- 
mitted by Stegmueller. Kuhn of course has arguments showing deductive 
disjointedness, but he does not regard them as sufficient for establishing inc_K. 
To establish inc_K and thus incomparability Kuhn uses features from domains (B) 
and (C) as well. I also use arguments establishing deductive disjointedness, but I 
remain content with them, I never proceed to show incomparability as well, I 
only infer that comparison by content is out and that other methods of com- 
parison must be found. 

Stegmueller’s second error is his presumption that in arguing for the deduc- 
tive disjointness of rivals (some rivals in my case, all rivals in the case of Kuhn) 
we commit a ‘logical mistake’ (p. 248). He says that we ‘fall back upon the view 
of the opposition’ as if the ideas we hold independently of the argument (struc- 
ture of paradigms, existence of a normal science in the case of Kuhn; theory 
dependence of observations in my case) amounted to a renunciation of the 
statement view, or as if the statement view had been shown to be impossible 
by independent reasons. But adoption of a normal science or of theory-depen- 
dent observation statements has no implications for the issue between the state- 
ment view and the structural account. Each feature of a paradigm that Steg- 
mueller uses to argue for his type of reconstruction can be produced by the 
statement view and there are features of science where the statement view has a 
definite advantage. This we have seen in section 5. No other arguments have 
been provided. The accusation of a logical mistake therefore rests on a dogmatic 
assumption of the uniqueness of the structural view.² 


1 The issue between coherence on the one side, closeness to experimental results on the 
other played a large role in the debates about the interpretation of the quantum theory 
and it was never resolved in a really satisfactory manner. 

2 Stegmueller continues by saying that ‘thinking in terms of derivability is the wrong way 
of bringing about a comparison of theories’ (p. 24), which is an acceptable account of 
my own position and a somewhat less acceptable account of Kuhn’s. But for Stegmueller 
the remark is meant as a criticism of us both. 


Changing Patterns of Reconstruction 

According to Stegmueller there is a deductive gap between any two theories. 
The gap is due not to mutually exclusive conditions of meaningfulness but to 
the fact that there is no way of establishing deductive relations between cores 
(p. 24). He closes the gap by his theory of reduction. The theory makes 
paradigms comparable, but does not yet turn them into rivals—and it is only 
between potential rivals that reduction in the proper sense can be said to obtain. 
We need additional conditions to exclude cases such as the ‘reduction’ of electro- 
statics to general hydrodynamical theory. Stegmueller has no such conditions 
and his theory is therefore incomplete at a decisive point.¹ Moreover, the 
demand for a duplication of the successes of the defeated rivals (p. 251) shows 
an ‘exaggerated rationalism’ of precisely the kind Stegmueller indicts at other 
places (p. 199). Adding these results to the fact, mentioned above, that most 
criteria of comparison that survive in the case of incommensurable theories 
(in my sense) are arbitrary and give conflicting results we see that, while it is 
correct that the battle between paradigms leads to the victory of the theory 
that has the greater achievements on its side (p. 251), these achievements them- 
selves are not ‘objective’ features but depend on changing criteria and changing 
decisions concerning an only vaguely defined balance of demands. 


7 MISCELLANEOUS AND CONCLUSION 


Stegmueller makes interesting comments on holism which go beyond the much 
less differentiated accounts of Kuhn and myself and which show the structural view 
at its best; he produces a criticism of Lakatos that is both ‘well meaning’ (p. 296) 
and penetrating; he criticises proliferation with the remark that it would demand 
‘superhuman efforts’ from the scientist (p. 309)²; he is against a ‘philosophy 
that makes prescriptions for science’ (p. 310)³; he tells me that nobody can know 


1 In a conference on correspondence rules that took place in Minneapolis in 1967 I made 
suggestions for comparing theories that are virtually identical with Stegmueller’s (cf. 
my [1970], p. 233). Since my [1958] ff. I have tried various pragmatic notions of rivalry 
([1958], p. 163; [1962], p. 94 f.; [1965a], p. 232 ff.; sections of my [1965b]). My sugges- 
tions may have flaws, but at least they see a problem unnoticed by Stegmueller. Steg- 
mueller’s notion of reduction has been anticipated by Körner ([1971]). 

2 Stegmueller asks for ‘social-political suggestions concerning the many failures who 
discovered too late that not every one can be a second Newton and a second Einstein’ 
(p. 309). My suggestion: teach people that science is a difficult business, that while it 
may contain numerous jobs for floorsweepers, janitors, plumbers it can be advanced 
only by those who are capable of being creative schizophrenics, i.e. who are capable of 
using a plurality of views in their research. Besides, I think that Stegmueller exaggerates 
the difficulties of a pluralistic life style. Every citizen in a democracy, every parent who 
has been brought up in one ideology, faced another one at his working place, a third 
when discussing with his children, is an expert at using various views in his arguments 
and his thinking. 

3 Reply: prescriptions for science must be looked at with great care whether they are now 
produced by philosophers, or by the scientists themselves. Most of the time research is 
guided by concrete decisions that are invented on the spot rather than by general rules 
that are argued out beforehand, though scientists, who are as much infected by norma- 
tivities as anybody else, will try to derive such decisions from general rules after the 
event, thus making their discoveries much less revolutionary than they actually are. On 
the other hand some new metaphysics leading to new general prescriptions may just be 
what is needed to advance a particular area of knowledge. Such new ideas are often 
brought in from the outside, by dilettantes, or laymen. Even journalists, or businessmen 
can guide science in new directions as is shown by Alexander Marshack, or Schliemann. 
that there is no correct method (p. 310),¹ that the demand for proliferation con- 
flicts with the slogan ‘anything goes’ (p. 309)² and that anarchy does not well 
combine with Hegel’s demand to make the ‘effort of the concept’ (p. 309).³ I 
have gained from many of Stegmueller’s observations and I was able to clarify, 
or change, ideas which I had regarded as more or less satisfactory. I also admit 
that the structural model combined with Kuhn’s philosophy will enable us to 
look at science in a new way and to improve our understanding of it. But I still 
conjecture that future work in this area (and in every area) will gain more from a 


1 Like many other readers Stegmueller has a wrong idea of the content of my ‘anarchism’. 
To locate the most common error, let us distinguish the following four positions: (A) 
old fashioned rationalism (Descartes, Kant, Popper, Lakatos; ancestor: the philosophy 
behind the apodictic laws of Exodus); (B) a form of rationalism that takes context and 
circumstances into account; ancestor: the philosophy behind the case laws of Exodus 
which is older than the apodictic philosophy and comes from Mesopotamia; it is also 
found in China, at the time of the oracle bones; (C) the view ascribed to me by many 
critics; and (D) my real view (ancestor: Kierkegaard, Concluding Unscientific Postscript). 

According to (A) it is rational (proper, in accordance with the will of the gods) to do 
certain things come what may (it is rational to prefer the more probable hypothesis, to 
avoid ad hoc hypotheses, self-inconsistent hypotheses, degenerating research pro- 
grammes). Rationality is universal, independent of context, and it gives rise to equally 
universal rules. According to (B) it is rational to do certain things in certain conditions, 
other things in other conditions. Rationality is not universal, but there are universally 
valid conditional statements asserting what is rational in what contexts and there are 
corresponding rules. (C) says (a) that both absolute and conditional rules have their 
limits so that even a relativised rationality, when followed to the letter, may occasionally 
lead us astray and it infers, (b) that all methodological rules are therefore worthless. 
(D), which is my position, agrees with (Ca), but not with (Cb). It argues for a contextual 
account, but the contextual rules are not supposed to replace the absolute rules, they are 
to supplement them. I neither want to replace rules, nor do I want to show their worth- 
lessness; I rather want to increase the inventory of rules, and I want to suggest a different 
use for all of them, in the following manner. 

Usually it is assumed that rules determine the structure of research in advance, they 
guarantee its objectivity, they guarantee that we are dealing with rational action. By 
contrast I regard each piece of research both as a potential instance of application for a 
rule and as a test case of the rule: we may permit the rule to guide our research, i.e. to 
exclude some actions and to mould others, but we may also permit our research to suspend 
the rule, or to regard it as inapplicable even though all the known conditions demand 
its application. In making the latter decision we are not guided by any clear insight into 
the limitations of the rule, or the incompleteness of the conditions it contains, for the 
conditions are complete, they demand that the rule be applied, and there is as yet no 
reason to modify the rule. We are guided, rather, by the vague hope that working without 
the rule, or on the basis of a contrary rule we shall eventually find a new form of rationality 
that will provide a rational justification for the whole procedure: a researcher is an in- 
ventor of new theories, new instruments, new principles of all rhyme and reason because 
rhyme and reason will be found only after one has moved a considerable distance without 
them. This is also what is meant by the slogan ‘anything goes’: there is no guarantee 
that the known forms of rationality will succeed and that the known forms of irrationality 
will fail. Any procedure, however ridiculous, may lead to progress, any procedure, 
however sound and rational, may get us stuck in the mud. 

2 Reply: these two demands belong to two different stages of my ‘development’. In my 
latest writings I use proliferation only to show how successful apparently absurd methods 
can be and how dangerous it is to rest content with plausibility. Cf. also my reply to 
Gellner ([1976]). 

3 Reply: Hegel tries to set in motion concepts that have become petrified and to develop 
and present knowledge in their terms. I try to follow him. This is a much more difficult 
enterprise than the enterprise Stegmueller is engaged in, viz. to freeze moving concepts 
and use them to present equally frozen areas of knowledge. 
healthy eclecticism than from commitment to a single point of view, however 
perfect. 


PAUL FEYERABEND 
University of California, Berkeley 
MORITZ SCHLICK AND THE MIND-BODY PROBLEM* 


Schlick was born in 1882 and his Allgemeine Erkenntnislehre was first published 
in 1918. A considerably revised second edition came out in 1925; and this large 
work has now been beautifully translated by Albert Blumberg, with a short and 
helpful introduction by him and Herbert Feigl. (Both Blumberg and Feigl 
participated in the Vienna Circle.) 

Carnap came to Vienna in 1926 and Schlick met Wittgenstein in 1927. So 
this book expresses Schlick’s philosophical outlook as it was shortly before he 
came under two dominant influences which radically changed it. The book is 
characterised by: an inside knowledge and high appreciation of natural science; 


* Review of Schlick, Moritz [1974]: General Theory of Knowledge. Translated by Albert E. 
Blumberg from the second German edition of Allgemeine Erkenntnislehre; with an 
Introduction by A. E. Blumberg and H. Feigl. Wien-New York: Springer-Verlag. 
Library of Exact Philosophy (ed: Mario Bunge) vol. 11. $32.80. Pp. xxvi+410. 

All italics within quotations are in the original. 
a spirit of epistemological optimism; a rejection of both psychologism and platonism
in favour of a nominalist view of concepts; a correspondence theory of truth;
the thesis that to know (Erkennen) something in the sense of understanding it is
not to know (Kennen) it in the sense of being acquainted with it; a rejection of
Machian sensationalism in favour of a bold scientific realism; a rejection of the
category of synthetic a priori knowledge; a continuous concern with Kantian
themes and problems, especially time, space, causality and the unity of consciousness
(Kant students should read this book); and a determined attack upon
the mind-body problem. Some of this is in line with the kind of positivism with 
which he was later so strongly associated. For instance, he held that there is 
‘an abyss’ between empirical statements and conceptual truths (p. 168) and that 
this dichotomy is exhaustive. And he held that ‘conceptual malformation’ has 
been responsible for ‘many pseudo-problems of a malignant character’ (p. 152). 
But much of the book runs counter to the spirit of positivism, as we shall see. 

The book was written as the opening volume in a series devoted to the natural 
sciences. Justifying its position there Schlick explained that ‘general epistemology
is bound to take the scientific knowledge of nature as its point of departure’ (p. x)
because epistemological principles are presupposed by, and have to be elicited
from, the sciences. He also held that a wrong-headed epistemology may 


violate the fundamental principles of all theory construction in physical 
science and fly in the face of empirical facts . . . [Thus] even for philosophical
viewpoints there is a kind of confirmation . . . or refutation through the facts
of experience (pp. 211-12). 


(However he also said ‘my account of knowledge and truth is simply a definition 
and thus most certainly a purely analytic judgment’ (p. 384 n). It had to be either 
empirical or analytic.) 

Schlick had come to philosophy from theoretical physics; he wrote his doctoral 
dissertation under Max Planck, and published Space and Time in Contemporary 
Physics in 1917. As a philosophical hedonist (see his Problems of Ethics) he valued 
pure science for the sheer pleasure it gives to those who engage in it (pp. 100-1). 

The epistemological optimism I spoke of shines through at many places. Here 
are some examples: 


... there is no doubt that in the sciences we really do possess both know- 
ledge and advances in knowledge. This implies that the sciences have at their 
disposal a sure criterion for deciding when genuine knowledge is at hand, 
and in what it consists (p. 6). 


. . . all the sciences . . . are engaged in creating a great network of judgments
designed to capture the system of facts. But the first and most important
condition, without which the whole enterprise would make no sense, is
that each . . . judgment is true (p. 79).


What is the criterion that assures us of truth? . . .

Since we know the nature of truth and are acquainted with its properties, 
we can also specify how the truth of judgments must make itself perceptible 
to us. 

The sciences long ago developed special methods . . . of verification (p. 162).
If by the “essence” of things we understand something that is knowable
at all, then surely the empirical sciences supply us with knowledge of the
essence or nature of objects. In physics, for instance, Maxwell’s equations
disclose to us the “essence” of electricity, Einstein’s equations the essence
of gravitation. With their help, we are able in principle to answer all questions
that can be raised with regard to these objects of nature (p. 242).


This epistemological optimism was somewhat mitigated elsewhere in the book. 
The idea that scientific theories can be ‘established beyond doubt’ (p. 9) became
the idea that they ‘remain only hypotheses’ and that we must be content if their 
probability ‘assumes an extremely high value’ (p. 73), since induction ‘furnishes 
only probability’ (p. 110). (The book ends with a short section on inductive 
knowledge with which Schlick was not satisfied (p. xiii) and which I will pass over.) 
And the idea that science gives us the essence of things was similarly modified: 


After we have broken matter down into molecules, molecules into atoms, 
atoms into electrons, the question could still arise of distinguishing parts 
within an electron, and the cognitive process advancing in this direction 
ought never to be reckoned as absolutely completed. The question ‘How 
then is matter constituted?’ can never receive more than a provisional
answer (p. 362). 


It is hardly surprising that there are, in a book over half a century old, things 
which will strike a modern reader as outdated, naive, or bizarre. In particular,
he is likely to boggle at Schlick’s curious horror of negation and his Kant-like 
over-estimation of Aristotelian logic, and to be perplexed by his theory of truth. 
Before proceeding to the central topic of the book, an attempted resolution of 
the mind-body problem, I will first, in the interest of historicity, say something 
about these matters, but I beg the reader not to let them put him off what is in
many ways a philosophical classic. 

Concerning negation Schlick wrote: 


. . . negation occurs solely because of our faulty makeup. Consequently, it
must be possible to do logic and science without taking negative judgments
into account. Strictly speaking, such judgments ought never to have found a
place in pure logic . . . The edifice of science consists exclusively of positive
statements (p. 64).


He protested that Brentano’s construal of ‘All men are mortal’ as ‘An immortal 
man does not exist’ is ‘an artificial construction which turns the natural state of 
affairs upside down’ (p. 42). I will suggest later that his theory of truth pushed 
him towards this exclusion of negative judgments. 

Concerning Aristotelian logic he declared that ‘in my opinion it still provides 
a means of presenting all logical relationships’ (p. 102); moreover, of the various 
moods of the syllogism a rigorous science needs only one, namely Barbara 
(p. 104)! How this was to be squared with his own emphasis on the importance 
for science of the derivation of singular predictions (p. 62), for instance of the 
return of a comet ‘at a definite point in time’ (p. 343), he did not explain. 
(Incidentally, in his own “example” of a syllogism in Barbara (p. 105), the minor 
premiss and the conclusion are singular statements.) A partial explanation for 
this extraordinary belief in the all-purpose sufficiency, at least within ‘rigorous 
science’, of the Barbara syllogism is his belief, when he wrote this book, that the
business of science is the discovery of universal laws. Apropos Kant’s doctrine 
of the permanence of substance or conservation of matter Schlick declared:


The only thing that science seeks to retain as absolutely immutable—and
indeed must retain if it is to gain any knowledge at all—are laws . . .
We have thus been led back to the concept of law as the ultimate terra
firma (p. 377).


(It was only later, under the influence of Wittgenstein, that he felt constrained to 
concede that universal laws, being unverifiable, ‘do not seem to have the character 
of statements that are true or false’.) 

His theory of truth developed out of his essentially nominalist theory of con- 
cepts, which has an interest of its own. 

Schlick repudiated a psychologistic interpretation of ‘concepts and other 
logical structures’: 


It is an old truth that ideas or images are not the same as concepts, that 
mental activities are not the same as logical relationships. But only recently
was this truth elaborated with full clarity—in the course of a feud against
“psychologism” (p. 135).


But he also repudiated a platonist interpretation of them. A mathematical line, 
for instance, is not laid up either in heaven or in consciousness. It is ‘an unreal
fiction’ (p. 135); likewise, the sequence of integers. Experience is a continuous 
flux; but each integer is discrete and fixed. But if integers, mathematical lines, 
logical relations, and other conceptual structures exist neither “out there” nor
“in here”, what are they and how can they influence our thinking? 


We prefer to face the problem directly and calmly, prepared to affirm from 
the outset that there is actually nothing “there” except the real processes 
of consciousness . . . . And we ask: How is it possible for real psychological
relations to furnish precisely what purely logical relations provide unless 
the two are the same... (p. 141)? 


Schlick’s answer, as I understand it, was that we somehow distil out of the con- 
tinuous flow of experience certain artificial distillates which are non-mental, 
discrete, fixed and precise. 

Concepts gain their precision through definitions of various kinds, the two 
most important being implicit definitions, an idea due to Hilbert ‘that is of the 
greatest significance for epistemology’ (p. 33), and ‘concrete’ definitions. A 
concept is implicitly defined when it occurs in an axiom system which puts 
formal constraints on it. Concrete (or ostensive) definitions ‘set up the connection
between concepts and reality’ (p. 37). Vagueness can be eliminated:


By a suitable choice it is always possible under certain circumstances to 
obtain an unambiguous designation of the real by means of the concept
(p. 71).


Schlick’s account of concepts was extensionalist and nominalist. A concept does
not portray or describe the reality it designates: it merely designates it. ‘For the 
form of a sign is wholly independent of what it designates; all that is involved 
is a reciprocal unique correlation’ (p. 359). The “meaning” of a concept (apart
from any formal characteristics given it by implicit definition) is its extension. 
Schlick likened the relation of a concept to the objects it designates to that of a 
person’s name to that person (p. 61) or of a nation’s flag to that nation (p. 65). 
Concepts are in their turn designated by words (p. 39). (It seems to me that this
turns concepts into interposed figments: why did he not go to the end of the
nominalist road with Hobbes and say that words directly designate objects?)

A nominalist tends to be more at ease with either singular statements or 
universal statements than he is with particular statements. In the case of ‘This
animal is dangerous’ or ‘All dodos are dead’ the italicised phrase can be presumed 
to pick out either one definite object or one definite set of objects. But what does 
the italicised phrase in ‘Some men are stupid’ designate? Hobbes had said: ‘Of 
indefinite significance is . . . that name which has the word some, or the like added 
to it, and is called a particular name’ (Molesworth, i, pp. 21-2). And Schlick 
said of particular judgments: ‘for science they have only a provisional significance,
as it were, and hence do not belong in a rigorous system. These judgments
subsume under a concept only a part of the objects correlated with a given con- 
cept, and do so in such a way as to leave undetermined which part of the whole 
set of objects is intended’ (p. 103). 

I now turn to Schlick’s theory of truth. For him the primary bearers of truth 
were neither thoughts nor sentences but judgments, a judgment being not a 
mental act of judging but the proposition or content assented to in such an act. 
He rejected the pragmatist equation of truth and verification as ‘totally incorrect’ 
(p. 165); if the predictions derived from a theory are borne out in experience, 
that makes it very probable that the theory is true, since purely accidental 
verifications are highly improbable (p. 164); but it does not constitute the theory's 
truth. If a judgment is true, then it is true irrespective of how far we have gone in 
establishing its truth. What holds for verifications holds also for the more 
dubious idea of self-evidence: a judgment which I find self-evident may in fact 
be true; but self-evidence, far from constituting truth, is only ‘a subjective 
psychical datum’ (p. 141). 

A judgment is said to be true if it agrees with the facts. But what does it mean 
for a judgment to agree with the facts? Schlick’s aim was to provide a clear,
definite, simple and straightforward answer to this question. It cannot mean that
the judgment is similar to the facts because ‘a judgment is something entirely 
different from that which is judged’; nor can it mean that the judgment is structurally
isomorphic with the facts, for in the judgment ‘The chair is to the right
of the table’, the term ‘chair’ is not to the right of the term ‘table’ (p. 61). Schlick’s 
answer was that a true judgment ‘uniquely designates a set of facts’ (p. 60). 

Now names designate objects but how can a judgment designate something? 
And how can facts be localized and individuated in a way that allows them to be 
designated or pointed at? Schlick did not answer these questions as explicitly 
as one would wish but I think that the following answer can be reconstructed 
from what he said. A judgment ties together certain concepts; concepts designate 
objects; objects in certain relations to one another constitute facts; so, a judgment 
uniquely designates a set of facts if its concepts designate one and the same set 
of objects. I cannot swear that this reconstruction is correct; but it is confirmed 
by what he said about false judgments, namely that they are ‘guilty of an 
ambiguity of correspondence’ (p. 62). Thus the false judgment ‘Sir Isaac
Newton is the author of The Compleat Angler’ is, presumably, “ambiguous” in
that its subject-term designates one object and its predicate-term another. 

This reconstruction is also confirmed by, and helps to explain, Schlick’s low
view of negative judgments. He said that the primary role of the negative judg- 
ment ‘is simply to reject the corresponding positive judgment, to brand it as an 
ambiguous sign’ (p. 63). A negative “judgment”, it seems, was not regarded
by him as a genuine judgment in its own right. My reconstruction would explain
why he took this curious view. Take the sentence ‘Sir Isaac Newton is not the
author of The Compleat Angler’. Its negative predicate-concept, far from designa- 
ting just Newton, designates every object except Sir Izaak Walton. If this sen- 
tence were counted as a genuine judgment it would presumably have to be 
branded as ambiguous; this can be avoided by construing it as an imperative to 
reject the corresponding positive judgment. 

Nevertheless, I have reservations about this reconstruction; for it yields a 
theory of truth which, although it works with ‘The Morning Star is the Evening
Star’ and, perhaps, ‘Everything that has a kidney has a liver’, breaks down with
‘The Morning Star shines bright’ and ‘Everything that has a liver is mortal’.

I turn now to Schlick’s scientific realism and to his criterion of reality. The 
basic question he posed at the outset was: 


What are these objects, these “things” or “facts” with which our signs are 
correlated in cognition? What is it that is designated? What is reality
(p. 172)?


As a first approximation to a criterion of reality, Schlick considered: ‘the real
is that which has an effect (wirklich ist, was wirkt)’ (p. 181). But he found the idea
of causal efficacy a shade too narrow: for example, it excludes as not real the last 
thought of a dying person (p. 182). But if, in response to this, we were to say 
that something is real if it stands in relations, whether causal or not, to other 
things, our criterion would become too wide: ‘Numbers are not real things: but 
no one denies that relations hold between them’ (p. 183). In the end Schlick 
adopted temporality as the criterion for reality: 


The temporality of all that is real is indeed a feature that can fulfill completely 
the role of the desired criterion (p. 188). 


This is in accord with his anti-platonism whereby timeless concepts are not real. 
However, as we shall see later, it creates a serious difficulty for his own kind of 
scientific realism. 

By this criterion, things that are real are so irrespective of whether they are 
perceived. A loaf of bread in the larder during the night is no less real than the 
same loaf seen on the table. And an atom that collides with another at a certain 
moment is no less real than the loaf of bread. ‘There is not the slightest difference 
between the two cases’ (p. 128). One argument which Schlick used for the exis- 
tence of a thing independent of any perceptions we may have of it was this. 
Let O be some familiar object, a pencil say, and let A₁, A₂, A₃ . . . be all the
possible appearances it could present to all kinds of observers under all sorts of 
perceptual conditions. Then there are these three alternatives: 
(1) O is equivalent to the set of all the A’s;
(2) O is equivalent to a subset of the A’s (viz. those A’s that are actualised in 
perceptual experience); 
(3) O is not equivalent either to all or to some of the A’s. 


As to (1) (which roughly represents Russell’s 1914 view): Schlick objected that 
it puts the cart before the horse by making what is real a function of mere
possibilities, turning the pencil, for instance, into (among infinitely many other 
such shadowy things) the optical sensations a bee would get which circled it in a 
certain way under certain lighting conditions. As to (2), which makes the pencil 
pop into and out of existence as someone looks at it and then looks away:
Schlick regarded it as the absurdity to which (1) is reduced when stripped of 
unactualised possibilities. So we are left with (3).

The object O, that is, the thing considered apart from perceptual experiences 
of it, Schlick called a ‘thing-in-itself’. (There was nothing pussyfooting about 
his terminology. He also called O a ‘transcendent object’.) And he rightly 
declared that it ‘surely follows from our criterion that things in themselves 
exist, since clearly many objects that must be thought of as temporally deter- 
mined are not among the immediately given’ (p. 195). 

Schlick’s scientific realism is an updated version of that of Galileo, Boyle, 
and Locke. The thing-in-itself is the thing as quantitatively described by physical 
science. As science progresses secondary qualities are stripped from things and 
put where they belong, in the observer’s consciousness. 


This process of eliminating qualities is at the heart of all advances in know- 
ledge in the explanatory sciences (p. 282). 


However, qualitative variations are not ‘simply ignored or discarded or neglected’ 
by science; their place is taken by quantitative variations which ‘run fully parallel 
with the former’ (p. 283). (This is the kernel of his solution of the psychophysical 
problem.) 

But a big difficulty attends Schlick’s scientific world-view. This difficulty can 
be indicated by contrasting Schlick’s position with Kant’s. For Kant,
things-in-themselves lie outside experience and outside space; phenomena exist within
a uniform and invariant spatial order which is somehow constituted by our 
spatial intuitions. For Schlick, things-in-themselves, things as understood by 
science, exist within a uniform and invariant spatial order; phenomena, things 
as we are acquainted with them in sensory experience, exist in subjective spatial 
orders which are peculiar to each individual percipient; moreover, an individual 
percipient has various spatial orders peculiar to different sensory domains: 
‘there is a visual space, a tactile space, a space of sensations of movement’ 
(p. 254). There can be no question of constituting one uniform and invariant 
order out of these very various subjective orders. 


The ordering of things-in-themselves is not only numerically distinct from 
the intuitive spatial ordering of our sensations, it is essentially different; 
transcendent objects cannot be localized in the space of intuition. For the 
objective ordering of things is unique, whereas there are many perceptual
spaces (p. 254).
So the objects of scientific investigation—atoms, light-waves, and the like—
transcend experience but are nevertheless real, existing in a spatial continuum 
which also transcends experience. Much of what was said above about space 
applies mutatis mutandis to time: physico-chemical processes occur in a uniform 
temporal continuum which likewise transcends our subjective experience of 
duration, in which an ‘hour creeps by slowly or rushes past, depending on
whether it is filled with boring or interesting content’ (p. 248). 


Thus the conclusion seems inescapable that the realm of transcendent objects 
is extended in time and generally in space as well, that consequently the
doctrine of the subjectivity of space and time—given such wide recognition
since Kant—is incompatible with our results (p. 244).


Now I come to the big difficulty. Would not a doctrine which declares that the 
realm of transcendent objects is a mere conceptual, man-made construct be no 
less incompatible with those results? Would not a doctrine which imputes to the
spatio-temporal framework an essentially fictional or artificial character, impugn
the reality of objects located within that framework? Yet Schlick held just such 
a doctrine. His temporality criterion had a strangely self-defeating character. 
It says that Mr Pickwick is not real whereas Charles Dickens is because only the 
latter existed in time, filled an interval in the temporal continuum. But to this 
temporal continuum itself, and also to the spatial continuum, it denies reality. 
Dickens, it seems, owes his reality to his location within a fictional framework. 


It is clear, however, that space and time cannot be declared real in the sense 
of our criterion itself; for time is not at a certain time, space is not at a 
certain place (p. 195). 


Thus the space of physics . . . is a wholly abstract structure, a mere scheme 
of ordering (p. 294). 


. . . in the case of time too we must distinguish between intuitive time, concerning
which empirical judgments may be made on the basis of psychological
investigations, and mathematical or objective time. The latter, like
space, is a conceptual construction (p. 357).


But it is in ‘mathematical or objective time’ that things-in-themselves have their 
temporal being; and a conceptual construction is ‘an unreal fiction’ (p. 135). It 
is as if Descartes had accommodated his vortices within a Leibniz-type
spatio-temporal framework.

Schlick attributed the main peculiarities of both Kant’s and Mach’s philo- 
sophies to ‘a desire to escape the psycho-physical problem’ (p. 200), and I think 
that the same can be said of his own account of time and more especially, of 
space. He attached ‘a quite special systematic importance’ to the psychophysical 
problem (p. xii); and my guess is that he de-ontologised physical space, turning 
it into a mere ‘ordering schema of the things-in-themselves’ (p. 261), in order to 
de-physicalise the things-in-themselves, and that he de-physicalised these in 
order to de-antithesise the body-mind ‘antithesis’ (see p. 291). 


From the vantage point we now have gained . . . the [psychophysical]
problem is solved even before it can be raised . . . . However, in order to set
our minds completely at rest about the question, we must also uncover the
source of the error that allowed the question of mind and body to become
such a tormenting problem (p. 289). 


He claimed that a ‘flawed concept of the physical is responsible for the contra- 
dictions in this great problem’ (p. 301). More specifically: ‘the spatial factor is 
somehow to blame for the genesis of the problem’ (p. 362).

Here again there is a striking structural similarity, plus a significant difference
of content, between Schlick’s position and Kant’s. Schlick quotes a passage from 
The Critique of Pure Reason (A 391) where Kant says that all the difficulties which 
beset the connection of our thinking nature with matter have their origin in 
an illicit dualism. (Schlick endorsed this: ‘Both pluralism and monism, each in 
its own way, contain a part of the truth. It is only dualism from which no good 
can be extracted’ (p. 333).) Dualists, Kant had said earlier, ‘regard extension, 
which is nothing but appearance, as a property of outer things that subsists 
even apart from our sensibility ... Yet the very space in which [objects in 
space] are intuited is nothing but a representation’ (A 385-6). Were one to overlook
what Schlick said about the space of physics, his position would seem very
different from Kant’s. Kant, Schlick said, 


designates matter as appearance, and thus as mere representation, because 
it has spatial properties and spatiality is a property of intuitions or representations.
But the truth of the matter is that physical objects—the objects dealt
with by physics—are non-intuitive; they are divested of all secondary 
qualities and of [intuitive] spatiality. For these latter all vary with the observer, 
they change with the angle of vision, the position, the lighting. But a physical 
object is the identical object that is independent of all such variation and is 
that to which these different perceptions are related. It does not possess 
intuitive spatiality (pp. 309-10).


The appearances of my body to myself and to others are just some of the A’s
related to this O. But O itself, my physical body, is a thing-in-itself; and it,
unlike a Kantian thing-in-itself, exists not in limbo but in objective space. Then 
how is this O related to my mind? Do they interact? Well, there is no a priori 
reason why they should not: 


That things so different as body and mind could act on one another seemed
totally incomprehensible . . .

But even if the physical and the mental were in fact two different domains
of the real, no difference in kind, however great, could constitute a serious 
obstacle to the existence of a causal relation between them. For we know of 
no law stating that things must be of the same kind in order to act on one 
another. On the contrary, experience everywhere shows that the most
disparate things . . . interact with one another (p. 301). 


Nevertheless, Schlick ruled out interaction. For he was a determinist who 
adhered to ‘the principle that causality in nature is closed’ (p. 303); and interactionism
implies that the causes of some bodily movements


would in part have to be sought in mental processes which cannot be
represented by means of physical concepts; physical causality would have 
gaps, and this would have a totally upsetting effect on the concept of natural
law and on the formulation of laws of nature (p. 299).


(For my part, I think that Schlick was right to hold that interactionism implies
physical indeterminism and wrong to conclude that interactionism is therefore
untenable. However, although he did not regard the principle of causality as a
priori true (p. 74), he did regard it as an empirical truth ‘obtained by induction
from the totality of observed laws’ (p. 388). We should remember that this was 
written before the revolution in quantum mechanics.) 

With interaction excluded the choice is presumably between a non-interactionist
dualism and some kind of monism. Not wanting any kind of pre-established
harmony theory, Schlick opted for monism. And this is where his
doctrine of the merely conceptual character of the space of physics, together 
with his doctrine of the merely designatory role of the concepts of physics, came 
to his aid. The two of them in conjunction deprive “the physical” of its distinc- 
tively physical nature: it becomes merely ‘extra-mental’; and the distinction 
between ‘mental’ and ‘extra-mental’ is not one between two different domains 
of reality but between two different ways of conceptualising the same reality: 


by ‘physical’ [or ‘extra-mental’] we must understand not a special kind of 
reality, but a particular mode of designating reality . . .

. . . the expression ‘psychophysical parallelism’ is entirely suitable for characterizing
our view that one and the same reality—namely, that which is
immediately experienced—can be designated both by psychological concepts 
and by physical ones (p. 310). 


Schlick made it quite clear that his parallelism was not like Spinoza’s meta- 
physical parallelism whereby thought and extension, although they are only 
aspects of one reality, are nevertheless essentially different aspects each of which 
is sui generis and self-sufficient. Schlick’s was really a linguistic parallelism: there 
is the language of psychology and the language of brain physiology and to any 
true statement formulated in the first there corresponds an equivalent statement 
in the second language. It was an early version of the identity-hypothesis: 


. . . in place of the dualistic assumption we introduce the much simpler
hypothesis that the concepts of the natural sciences are suited for designating 
every reality including that which is immediately experienced. The resulting 
relation between immediately experienced reality and the physical brain 
processes is then no longer one of causal dependency but of simple identity
(p. 299).


Suppose, for the moment, that it is possible, at least in principle, for me to mark 
off and individuate each of the sixty mental states of one second’s duration that 
I went through during the last minute. Let us label these m₁, m₂, . . . m₆₀. Then
Schlick’s identity-hypothesis asserts that there are sixty brain-states of one
second’s duration, say b₁, b₂, . . . b₆₀, such that m₁ = b₁, m₂ = b₂, . . . m₆₀ = b₆₀.
And these strict identities mean that every property and relation of a mental
state is also a property or relation of the corresponding brain state.

This is an existential hypothesis: the m’s are supposed to be “given” in 
consciousness but the corresponding b’s are only postulated. This means that 
the hypothesis is not empirically refutable, a feature of which Schlick took some 
advantage. He discussed, without really countering, certain ‘arguments put
forward by advocates of interaction . . . to the effect that a thoroughgoing correlation
of quantitative concepts with mental qualities is absolutely impossible’
(p. 314); but he was undeterred by these objections: 


For the objections discussed above show only the inadequacy of the attempts
made thus far to formulate a physiological hypothesis; they cannot establish 
that a physiological—in the final ae ee is in 
principle impossible (p. 318). 


Earlier in the book, in a discussion of the unity of consciousness (something
to which Kant, of course, had given the greatest importance), Schlick had himself
drawn attention to certain facts about consciousness with which, it seems to
me, his identity-hypothesis can hardly be reconciled. 

Perhaps the best way to indicate Schlick’s ideas, here, is by contrasting them 
with a view he rejected, namely psychological atomism. Hume said that a self 
or ‘I’ is ‘nothing but a bundle or collection of different perceptions’ (Treatise,
Selby-Bigge, ed., p. 252). He added: ‘All perceptions are distinct. They are,
therefore, distinguishable, and separable, and may be conceiv’d as separately
existent, and may exist separately’ (ibid., p. 634). By contrast, Schlick held, 
and he was surely right, that a peculiar (he actually called it ‘indescribable’) 
interconnection, both between contemporaneous and successive perceptions, is 
essential to consciousness (p. 125). For let us try to follow through Hume’s 
idea. Let p1, p2, p3 . . . be a “bundle” of distinct perceptions. Being separate
existents, each of these could, logically, occur without the others. So let us sup-
pose that p1 occurs and then totally vanishes, then p2, and so on, so that there is
now a sequence of perfectly discrete perceptions occurring one after the other.
Could we still say that this constitutes consciousness? Schlick’s answer was No; 
for the situation here would be the same, ontologically, if p1 had occurred in
one centre (say, an oyster) and then p2 had occurred in another centre, and so
on. (Schlick mentioned that ‘Wundt also remarks that a momentary “conscious-
ness” would have to be called an “unconscious” one’ (p. 125n).) Although he
did not use the term ‘specious present’, Schlick endorsed the idea it stands for. 
‘The individual moments of consciousness’, he wrote, ‘exist not for themselves
but, as it were, for each other’ (p. 125). Let us try to make this a bit more
definite with the help of a simple example. I am at an official dinner when the 
presiding officer rises, raps on the table, raises his glass, and says: ‘Ladies and
gentlemen, let us drink a toast to Her Majesty the Queen.’ This performance 
takes about ten seconds. It would be misleading to say that during it I have a 
sequence of visual perceptions running alongside a sequence of aural perceptions;
visual and aural perceptions fuse into one experience. As well as this integration
of simultaneous perceptions from different senses, there is also integration through 
time. I do not first hear ‘Ladies and . . .’, then stop hearing that and start hearing 
‘gentlemen...’ The earlier perception lingers a while, merging into its succes- 
sors, and fading only gradually. I am in a way still vaguely “hearing” ‘Ladies 
and gentlemen’ when he gets to ‘Her Majesty . . .’—it has not yet become some-
thing which, like a remark heard over the soup, now needs an act of recall for
me to bring it to mind again. 

All this is very familiar and was very well understood by Schlick; yet it seems 
to me to provide a strong prima facie argument against his identity-hypothesis.
In order to state that hypothesis I assumed, provisionally, that it is legitimate 
to operate with the idea of two parallel sequences, one of mental states and the 
other of brain-states, such that for each one-second segment of the first there is a 
contemporaneous segment of the second with which it can be correlated and, 
indeed, identified. Now in the course of a second a number of perfectly precise 
changes can be assumed to take place in the brain, though the number is so
enormous that there can be no question of plotting them. In a chapter on ‘The
Brain and the Unity of Conscious Experience’ in his [1973] Sir John Eccles, 
after mentioning Crawford’s finding that at least one-fifth of a second of cortical 
activity is the minimum required before a flash of light can be consciously 
detected, adds that, with respect to neuronal activity, one-fifth of a second, 


is very long indeed. The time for transmission from one nerve cell to an-
other is no longer than 1/1000th of a second; hence there could be a serial
relay of as many as 200 synaptic linkages between nerve cells before a
conscious experience is aroused. Many thousands of nerve cells would be 
initially activated, and each nerve cell by synaptic relay would in turn acti- 
vate many nerve cells. The immensity of this patterned spread throughout 
the neuronal pathways of the brain is beyond all imagining (p. 71). 


So on this side of the alleged equation we have, in the space of a second, a (very)
large number of determinate neuronal events. And on the other side? Well, if 
that one second of consciousness happened to include the perception of a flash, 
click, pinprick, or other momentary event, we might claim that a definite event, 
which can be individuated, occurred within it. But suppose it is the second in 
my consciousness during which the words ‘. . . a toast to Her . . .’ were uttered. I 
claim that no completed event occurred in my consciousness during that inter- 
val. 

Even on Schlick’s nominalist or designatory theory of concepts, for an identity- 
statement to be a candidate for being true it must be possible that the two terms 
in it designate the same object. This requirement is satisfied in Frege’s famous 
example: ‘The Morning Star’ designates one quite definite object, and ‘The
Evening Star’ designates one quite definite object of an essentially similar type, 
and there is no a priori reason why the object designated by the first should not 
be the object designated by the second. But this requirement is not met by, say, 
‘The Morning Star is the evening stillness’. The second term in this sentence,
unlike the first, designates something diffuse and without sharp boundaries. 
Schlick’s own account of consciousness suggests to me that ‘The Morning Star
is the evening stillness’ provides a closer analogue for the thesis that cortical 
changes are conscious experiences than does ‘The Morning Star is the Evening
Star’. 

But have I not over-reached myself? I said that his identity-hypothesis is not
open to empirical refutation, being an existential hypothesis. Yet I have been 
arguing against it by appealing to certain empirical facts: to facts ascertained 
by brain scientists, and to facts to do with the unity of consciousness as described 
by Schlick which each of us can check in his own experience. Well, Schlick did 
not say what features, or aspects, or patterns of cortical changes are mental
processes; he said only that something in the brain is mental. And it remains 
logically possible that the firings of the 10¹⁰ or so nerve-cells in the brain could
generate some physical patterns having the peculiar properties which conscious- 
ness has, patterns of whose individual moments Schlick could have written, as 
he wrote of the individual moments of consciousness, that they ‘exist not for 
themselves but, as it were, for each other’.

Schlick conceded that the unity of consciousness, with its peculiar inter-
connectedness, constitutes a prima facie argument for dualism:


In one respect, . . . dualism might still seem to stand unrefuted. Mental 
qualities have that special relationship which, as the interconnection of
consciousness, has so often occupied us. And in this way they are dis-
tinguished from all other qualities . . .
Now this interconnection is indeed something quite special . . . [S]cience
thus far does not possess quantitative concepts by which to designate the 
interconnection (p. 332). 


He also raised an objection against his own view which, unlike the present
objection, ‘stresses the simplicity of many experiences and contrasts this with 
the complicated character of the correlated physical processes’ (p. 320). Suppose 
that a sleeping person is half-awakened by, say, a steady drone which he drowsily 
hears for a while and then sinks back into sleep. Then he briefly has an essentially 
simple experience. But its physiological correlate 


is apparently extremely complex. The physical processes . . . are enormously
complicated. From among the innumerable cells of which the brain is 
composed, a goodly number go into action when a sensation takes place . . . 
And now the concept of a brain process . . . is supposed to designate a single 
quality, namely, this simple sound! Is this not a truly unsolvable contra- 
diction? This objection is so basic that there seems to be no escape from it 
(p. 320).

Schlick sometimes posed objections more effectively than he dealt with them.
He avoided this one by exploiting once more the existential nature of the identity-
hypothesis:


But we do not know which process is to be associated with a simple sensation 
as its physical correlate. Certainly the correlate is not the total brain process, 
but only some part of it . . . Thus it may be a very small partial process, one 
that is extremely simple (pp. 320-1). 


Indeed, Schlick leaned rather heavily, in this connection, on our present ignor- 
ance of the details of brain processes. He assumed that this ignorance will in due 
course be replaced by knowledge in accord with the identity-hypothesis: a time 
will come when the physical correlates not only of simple sensations but of the 
unity of consciousness itself will be found. ‘The problem of consciousness will
then be solved’ (p. 332). 

Even where one disagrees with this book one has to admire its metaphysical 
boldness. There is a passage in it which has a certain piquancy in view of the 
militantly anti-metaphysical line which Schlick took later. In this passage he 
was discussing what he called ‘immanentism’ by which he meant the sensationalist 
empiricism of Avenarius and Mach. He began by saying: 


There is no doubt that a withdrawal into the immanence standpoint
obviates and makes unnecessary a whole series of philosophical struggles.
Surely every serious thinker has at times felt the temptation to rid himself 
of tormenting problems by adopting the immanentist viewpoint (p. 198). 


The main problem obviated by this viewpoint is ‘the question of the relation 
between the mental and the physical’:


We see clearly that it is just this problem before which philosophers have 
taken refuge in the fortress of immanence . . . Even if one of the most 
prominent representatives [Mach] of the view had not explicitly stated this 
to be the case, we could readily see that all forms of the immanence idea 
arise from a desire to escape the psychophysical problem (pp. 199-200).


But at this time Schlick vigorously repudiated ‘immanentism’. It was a main 
theme of his that knowledge consists, not in being acquainted with something 
given, but in conceptual description and explanation. The need for conceptual
knowledge is biologically rooted; an animal whose perceptions were merely 
sense-data would perish: ‘An animal must perceive prey as prey, an enemy as 
an enemy’ (p. 94). And he rejected what he called the ‘positivist desideratum’, 
which would restrict us to the bare elements of sensation, on the ground ‘that 
a meticulously rigorous execution of this program would unfortunately mean a 
total renunciation of knowledge’ (p. 198). 


J. W. N. WATKINS 
The London School of Economics 


Elsasser, W. M. [1975]: The Chief Abstractions of Biology. Amsterdam: North-
Holland. Paperback 45 Dfl. (US $18.75). Pp. xiv+261.


Elsasser’s aim in this book is to provide an outline of a holistic approach to 
biology which avoids slipping into anything resembling mystical vitalism and 
which will provide powerful guidance for the development of a non-mechanistic 
biology. He provides a criticism of reductionism as well as examples of the sort 
of research he considers particularly worthwhile. In my view neither of these
sections is fully satisfactory. The criticism of reductionism appears to rely on
a dubious theory of confirmation which is, to say the least, arguable. And 
although the attempt to indicate how different philosophical approaches to 
biology have implications for the content of the biological sciences is an interest- 
ing one, his actual examples do not seem to me likely to cause those impressed 
by the successes of mechanistic biology to revise their opinion of the fruitfulness 
of an organismic approach. However, Elsasser’s interest in this aspect of the 
debate, and in general his concentration on its implications for science rather 
than for more general philosophical concerns, distinguishes him from much¹
of the recent anti-reductionist philosophy of biology produced partly as a reaction 
to the reductionism of Monod [1970]. This book is designed to meet ‘the 
scientist’ on his or her own ground. 

Elsasser is primarily neither philosopher nor biologist but a physicist who 
(like Schrödinger, Delbrück and Bohr²) is impressed by the apparent difference
of approach in biology, namely the feeling of remaining very much in the thick 
of the empirical data and the comparative absence of laws, or at least of those 
which intuitively resemble their counterparts in the physical sciences. However, 
unlike Schrödinger, he does not hold that the laws of physics can be additional
biological laws, allowing the deduction of all biological phenomena; von 
Neumann’s 1933 proof of the impossibility of a consistent extension of quantum 
mechanics establishes, according to Elsasser, that the existence of such a law is 
inconsistent with modern physics. Elsasser’s own ‘organismic functions’ are not 
laws in this sense—indeed in his view the division between the inanimate and 
the animate is the division between those aspects of the world which can be 
fully subsumed under a deductive and mathematical schema, and those in which 
such a schema is necessary but inadequate. 

The book consists in the main of two interwoven themes. The first is his logical 
criticism of reductionism, which is basically a development of his previous 
books, though moving away from his earlier emphasis on arguments from in- 
formation theory. The second is a defence of the heuristic power of an organismic 
approach to biology, and includes sections on empirical research into indivi- 
duality, and on the mathematics of semi-definite constructs. It is these two themes 


¹ For example, most of the contributions to Lewis (ed.) [1974] or, pre-dating Monod,
Koestler and Smythies (eds.) [1969].
² Schrödinger [1948], Delbrück [1966].
and their relation to Mach’s philosophy of science which provides a background
to them, that I will be discussing. The book also contains a section on ‘historico-
philosophical precedents’ and two semi-technical appendices, on von Neumann’s
automata models and on probabilistic induction. Elsasser’s criticism of reduc-
tionism is complex and I have to confess to finding his often repetitive presenta-
tion of the arguments confusing and occasionally ambiguous. The following is
therefore as much a rational reconstruction as a summary of his main argument.

Elsasser’s claim is that biology is not fully reducible to physics—i.e. it cannot
be ‘dealt with in terms of mechanisms and their information content only’ (p. 14)
—because of the fundamentally different nature of the subject-matter of the two 
disciplines. The biological is characterised by a degree of complexity unknown 
in the inanimate world. Assume that we consider only classes which are both 
finite and discrete; we can calculate the number of possible configurations of a 
system of any given complexity. In the case of biological phenomena, for which 
this assumption seems plausible, the resulting number will be immense (an 
immense number being one whose logarithm is a large number) and the subset 
of those configurations compatible with the laws of physics is still immense.
The number of actual configurations to occur in the real world is an immensely
small proportion of this. This idea—the ‘principle of finite classes’—applies 
also to the inanimate world (e.g. statistical mechanics); its increased significance 
for biology derives from the fact that ‘the organism is specifically designed for 
the utilisation of the effects of contingencies’ (p. 26). The configuration of a 
particular organism should be interpreted as containing contingencies which 
operate within a set of mechanisms—the organism is not fully causally deter- 
mined and there is a connection between microlevel and macrolevel variation. 
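
The scale of this counting argument can be illustrated with a short Python sketch. The figures used here (1000 sites with 4 possible states each, and 10^30 observed samples) and the function names are purely illustrative assumptions; they are not taken from Elsasser's book. Logarithms are used because, on the definition quoted above, an immense number is one whose logarithm is itself large.

```python
import math

def log10_configurations(sites: int, states_per_site: int) -> float:
    """log10 of the number of configurations of a finite, discrete system
    in which each of `sites` positions takes one of `states_per_site` values."""
    return sites * math.log10(states_per_site)

def log10_fraction_realised(sites: int, states_per_site: int, samples: int) -> float:
    """log10 of the largest proportion of configurations that `samples`
    actual instances could ever cover (at most one configuration each)."""
    return math.log10(samples) - log10_configurations(sites, states_per_site)

# Hypothetical figures, chosen only to show the orders of magnitude involved.
log_total = log10_configurations(1000, 4)
log_fraction = log10_fraction_realised(1000, 4, 10**30)
print(f"possible configurations: about 10^{log_total:.0f}")
print(f"proportion realisable:   at most 10^{log_fraction:.0f}")
```

Even with a generous allowance for the number of actual instances, the proportion of possible configurations that can occur stays vanishingly small, which is the point on which the principle of finite classes rests.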

This view of the role of contingencies suggests that it is inappropriate to deal 
with a ‘level’ involving unpredictable elements by means of statistical averaging 
processes, as, for example, ‘temperature’ is derived from averaging the behaviour 
of molecules. It is the variations between organisms which are of interest, not an 
average which could be created from them. The defence of this approach is in the
demonstration of its heuristic power. However, the principle of finite classes 
suggests that such an averaging procedure is not only misguided but invalid: for 
given that our set is finite and that at all biological levels complexity (i.e. the 
number of ways in which a set can be inhomogeneous) is great, we will not in 
general be able to prepare homogeneous subsets and thus will not in general be 
able to assume indefinite repetition of experiments. ‘Even the idealised observer 
runs out of samples before a sufficient uniformity and homogeneity of samples is 
obtained’ (p. 28). Indefinite repeatability is the cornerstone of the Cartesian 
reductionist method and rests on the assumption that the subject-matter of physics 
and chemistry approximates an infinite homogeneous class—as this assumption 
is false for biology, Elsasser concludes that some fundamentally different method 
must be found. 

It seems to me that this line of argument can be criticised on two grounds.
The first brings me to the third major theme of this book: the attempt to show
that holism is a consequence of the application of Mach’s positivistic ideas to 
biology. Elsasser does not claim that this startling idea led to his own belief in 
holism, which seems to have come more from the strongly empirical and non-
mathematical emphasis of the biological sciences. However, positivist or opera-
tionalist ideas play an important part in his defence of organismic biology, in
various ways, the most important being in his idea of the organismic function
and its heuristic role, which I discuss later. It also provides Elsasser with his 
idea that indefinite repeatability of experiments is the cornerstone of ‘the 
cartesian method’, that is, the methodology of modern physical science. In-
definite repeatability means more than that experiments should be inter-
subjectively repeatable; it means that ‘the reliability of an experiment can
be indefinitely increased by repetition’ (p. 28). It is this which Elsasser claims
‘underlies the ancient model of the clockwork universe’ and, one assumes, the
modern one.

Elsasser is not very explicit about his views on scientific method, but from 
quotes like the above he seems to hold the position that scientific theories in the 
physical sciences can be more or less verified by experimental results. Thus by
showing that we cannot regard our experimental results as certain (or rather 
indefinitely reliable) in biology he claims to establish that this semi-justificationist 
framework is inappropriate to it. However, if one takes the position that it is
equally inappropriate to the physical sciences, then this distinction between the 
two breaks down. And of course it is perfectly possible to maintain belief in a 
‘clockwork universe’ while abandoning this idea of scientific method, in par- 
ticular that form of confirmation theory which holds that the truth of a statement 
about an experimental result can be corroborated to an arbitrary degree by 
repetition of the experiment. 

It is sometimes incorrectly assumed that Elsasser is arguing invalidly from 
our ignorance of particular mechanisms to their non-existence. But within a 
positivistic framework he can argue from untestability to non-existence, or at 
least to exclusion from science. However, my second objection is that inhomo- 
geneity does not seem to me as great a problem for testability as Elsasser takes 
it to be. 

If he wished to argue that meaningful empirical results can be derived only 
from subclasses which are homogeneous in all respects, then, given his argu- 
ments for the impossibility of a homogeneous biological class, it follows that there 
can be no biological (as opposed to purely chemical) experiments. This is clearly 
not Elsasser’s position. He implicitly recognises that a sample does not have to 
be totally homogeneous, but only so with respect to those variables relevant to
a particular experiment. In some areas of research this has presented problems 
which after much effort have been overcome—the production of genetically 
pure virus strains, for example. It may be that certain experiments will never in 
practice be possible as one cannot produce a sample which is sufficiently homo- 
geneous with respect to the relevant variables, but this is also the case in the 
physical sciences; cosmology, astronomy and, I imagine, Elsasser’s own special-
ism, geophysics, would provide examples. Given that Elsasser provides no general
criteria for a class to be ‘sufficiently’ homogeneous which could be seen to apply 
in some fundamentally different way to biology than to the physical sciences, it 
does not seem to provide the basis for a theoretical distinction between the two 
areas. 

Unlike most modern critics of reductionism, Elsasser is not content to argue
that it is unsatisfactory only on metaphysical grounds; he also attempts to
demonstrate that an alternative would actually produce better science. This
attempt to work out the implications of organismic biology for research is an
original and interesting contribution to the general debate. He does this in 
various ways, by looking at existing research on individuality, the role of the 
organismic function and the mathematics of semi-definite constructs. 

The section on research into individuality is, in fact, meant not only as a 
demonstration of heuristic power, but also of his claim that organisms are
‘designed for the utilisation of contingencies’. So in this section it is not enough 
for him to merely demonstrate the existence of variation, he must also show its 
significance for the organism. Likewise, it is not enough to show that there is
research being done into individuality; he must show that it is interesting and
fruitful research. He seems to me to be unsuccessful in both respects.

He outlines convincing evidence for an unexpected degree of individuality 
over wide areas and provides references to further work. Examples he discusses 
include anatomical variability (e.g. relative weights of organs, topological struc- 
ture of blood vessels in the hand), chemical variability, and the uniqueness of 
fingerprints. The range of variation is often very striking, and he provides con-
vincing arguments against ignoring it in certain situations—for example medical
treatments which are highly successful on a few patients but have no effect on 
most may be abandoned as a result of considering only the effect on an ‘average’ 
patient; there may be no treatment which will work on all patients, yet each
patient might be cured by one of the treatments. But such arguments for the 
utility of investigating individuality do not establish that such individuality is 
necessarily of any value to the organism. Elsasser writes that ‘Nature seems to 
go to quite extraordinary lengths to demonstrate that organisms of the same 
class (e.g. species) could not possibly have originated in the course of a process 
that resembles what happens in an industrial assembly line’ (p. 43). But the 
somewhat platonistic form of crude mechanism that held that variation was the 
result of a ‘technological inaccuracy’ in the reproduction of a single blueprint 
had to be abandoned with respect to heritable variation with the acceptance of 
natural selection. Although within an evolutionary framework the existence of 
individuality is highly significant, the specific forms of the variations are generally 
not regarded as of theoretical interest. Elsasser makes no attempt to provide 
arguments for the significance of the actual form of the variation, or of non- 
heritable variations such as fingerprints, for theoretical biology. This seems to 
me a crucial weakness in his argument. 

This chapter seems to me no more successful in its other aim. To move 
from Elsasser’s promises of significant and exciting contributions to theoretical 
biology to, for example, six pages establishing the virtual impossibility of finding 
two people with identical fingerprints, is to suffer a distinct feeling of anti- 
climax. The few theories that are put forward in this section—for example that 
there is a connection between the structure of blood vessels in the hand and 
manual dexterity—remain at the level of empirical correlations. On the basis of 
the evidence presented in this chapter, investigations inspired by a view of the
importance of individuality seem to be concentrating on documenting its extent,
and such documentation, by itself, seems to add little to our theoretical understanding.

Chapter 4 consists of an examination of problems that have been particularly 
unresponsive to mechanistic approaches, and of suggestions as to how the 
‘organismic function’ might provide more hopeful heuristic guidance. This
organismic function is variously defined: ‘The total influence of contingencies
upon the organism’ (p. 16) or ‘. . . those biological consequences of utter com-
plexity which eventuate in phenomena not fully reducible to physics and
chemistry’ (p. 74). Such functions ‘represent a possible new form of order based
on unpredictable fluctuations, that is, made possible by a physical indeterminacy’
(p. 89). They are ‘the quasi-mechanism . . . [of] a gradual diffusion, as it were,
of the effects of some of an immense number of microscopic patterns into higher
levels of organisation’ (p. 84). Although the general principle is clear, I find it 
hard to extract from this book a precise picture of what the organismic function 
actually is. Elsasser hopes to make this clearer as well as demonstrating its heuris-
tic implications in his examples, of which the discussion of genetic determination
(pp. 116-29) is the most detailed. In this case the problem is the nature of the
connection between the transmission of the properties inscribed in DNA, that 
is, purely chemical properties, and the transmission of morphological charac- 
teristics (chemical as well as geometrical). Modern biology is remarkably silent 
on the nature of this connection—although specific chemical causes for variations 
in form have been suggested, there seems to be no serious hypothesis about the 
existence of form as such: ‘it is easy to understand that the replacement of one
enzyme by another should in the process of embryonic development turn a blue
eye into a brown one, but it is utterly obscure how any set of enzymes plus rules
for their appearance or cessation could be appropriate by itself to give rise to 
the morphological pattern of an eye.’ Yet morphological characteristics are, by 
and large, clearly heritable. 

Elsasser’s claim is that not only is nothing known of the mechanism of inherit- 
ance of morphological properties, but that ‘there is extensive evidence to the 
effect that such a mechanism does not exist’. However, Elsasser’s observation 
that the increase in quantity of DNA between a bacterium and mammalian 
cell is less than the corresponding increase in complexity hardly seems to qualify 
as ‘extensive evidence’ particularly given current views on the a of 
apparently redundant DNA. 

Elsasser claims that his approach, unlike reductionism, does sigue an 
answer, at least schematically, to the a ois of the inheritance of morphological 
properties. 


The organismic function maintains or regenerates morphological features, 
as the case may be, guided by the chemical relationships of the mechanistic 
skeleton. But there is no full, experimentally demonstrable storage of in- 
formation in the intervening time interval, such as a purely mechanistic 
device would require. Only part of the information is supplied by the 
mechanistic skeleton, the remainder is maintained or regenerated by the 
organismic function without continuity of information in a mechanistic 
sense.


This takes place ‘even when the number of alternatives made possible by the 
missing information is immense’. Hereditary resemblance indicates a causal 
relationship but this causality cannot be ‘reduced’ to ‘the ordinary causality 
of physics’, it is ‘epigenetic causality’. It seems to me that there are three possible 
interpretations of the organismic function. It could be (1) the feeding in of a 
degree of indeterminacy into the mechanisms of the organism, (2) a direction or 
pull exerted on physically indeterminate microlevel events towards particular
macrolevel phenomena, or (3) an uninterpreted correlation between an un-
predictable microlevel and a (mainly) predictable macrolevel, which should not
be regarded realistically. 

Elsasser’s idea of the organismic function must contain at least (1) above, 
as he speaks of the ‘devices . . . chemical in nature, that release the organismic
function’ (p. 91). However, it must be more than this. If it were merely a way of
allowing microlevel indeterminacies to be amplified in such a way that they have 
macrolevel consequences, this could be achieved within a mechanistic frame-
work. Schrödinger’s famous thought experiment provides just such a mechan-
ism. And if the organismic function can achieve such feats as the maintenance
of information in non-physical form it would seem to require rather more unusual
properties. If it is to be regarded realistically it would seem that we need 
interpretation (2) above, that it is some sort of vital force, or guiding or teleo- 
logical principle. But this is clearly not what Elsasser wants (for example, p. 89).
So we are left with the position that the organismic function is a formal connec-
tion between levels—a sort of epistemological emergentism which he sees as
resulting from the application of Mach’s positivistic approach to biology.

If one accepts Elsasser’s view on the importance of Mach’s ideas for the 
development of physics, it seems to me this only highlights the difficulties for 
Elsasser’s application of them. It is precisely those approaches which Mach’s 
methodology suggested for physics which are to be abandoned for biology. We 
should stop demanding mathematisation and quantification—logic is the langu- 
age of biology (p. ix) and it seems that we cannot expect our organismic theories 
to allow the deduction of accurate testable predictions. The two levels connected 
by the organismic function are connected in a non-rigorous way—Elsasser 
admits the possibility of an algorithm to express the organismic function, but 
clearly considers it remote (pp. 206, 215). Certainly no suggestion is given as to 
what it might look like. So having used positivism to argue that we cannot reduce, 
say, morphological characteristics to chemical ones, he is left with an organismic 
function which offers no more than teleological-sounding description (surely 
to say that information is stored non-physically is to indicate a problem for 
biology, not a solution?). Elsasser agrees many biological phenomena can and 
should be explained mechanistically. So his heuristic guidance comes down to 
this: when faced with a problem not immediately amenable to a mechanistic
solution one should appeal to ‘an implicit form of organisation which . . . cannot 
be made fully explicit’ (p. 215). If this cannot be expressed in a rigorous manner 
it is a retreat from scientific explanation into untestable speculation. 

To attempt to argue against a deterministic model in favour of one in which 
contingencies play a crucial role in determining higher level properties of a 
system seems to me interesting and intuitively quite plausible, although I do 
not find his evidence for the model’s heuristic power very convincing. His claim 
that such a programme should be pursued outside a mechanistic framework 
rests on a logical attack on reductionism which I criticised above, and on the 
organismic function, which is inadequate if not counter-productive as a 
demonstration of its heuristic implications. 


ALLISON QUICK 
London School of Economics 
These three books are concerned with a common topic, the nature of space and 
time and associated philosophical problems. Yet it would be hard to find three 
such diverse tomes for style, attitude and even those aspects of their subject 
matter which they emphasise. 

Zwart’s book is the most straightforward and most readable. He has a self- 
assured and confident style and dismisses contrary views with an easy wave of 
the hand (and an argument which he clearly considers impregnable) no matter 
from what authority or genius they emanated. Nonetheless, he does place 
before the reader a cogent case for a relational view of time; where time is 
considered as a generalised relation of ‘before-and-after’ between phenomena. 
He considers the before-and-after concept as primitive and elementary, and 
one of his most interesting chapters deals with the ontogeny of the recognition 
of this concept and the important role of sounds and rhythm in this regard. 

Since the direction of time is implicit in his definition of it, he dismisses 
time-reversal as meaningless. However, he also offers strong logical arguments 
for rejecting suggestions that time can flow in different directions for different 
observers or change direction for a particle such as an electron in Feynman 
diagrams. 

After establishing his basic thesis about the meaning of time, he considers 
a number of associated problems—causality, the direction of natural processes, 
entropy, Zeno’s paradoxes, etc. He devises an ingenious solution of the para- 
doxes by suggesting that time and space exist in the form of quantum-like 
intervals with blurred edges whose dimensions and properties are directly 
derived from Heisenberg’s uncertainty principle. 

His attitude to Special Relativity and time-dilatation reflects his characteristic 
‘common-sense’ approach; for him time-dilatation must have its origin in some 
form of absolute motion; he sees it as being intelligible in terms of motion 
relative to the system of fixed stars. He is apparently oblivious of the twentieth- 
century discovery of an expanding universe of galaxies outside our Milky Way. 
Hence he dismisses the possibility of a universal measure of time without 
mention of the notion of cosmic time whose possible implications go beyond 
that of a universal measure associated with Hubble’s Law. 

In spite of this lacuna in Zwart’s thesis, his book offers an enjoyable and 
interesting account of an approach to the concept of time which would be shared 
by many scientists and philosophers but perhaps not so clearly articulated and 
defended by them. It is an expensive book but nevertheless should be in libraries 
and read by students and practitioners of science and philosophy. 

In line with his approach Zwart criticises the attempts (e.g. of Grünbaum 
and Smart) to develop a tenseless approach in discussions of time. Hinckfuss’s 
book offers an essay towards a complete and self-consistent version of this 
approach. Like Zwart, he is not frightened to criticise and disagree with the 
authorities, whether they be his philosophical opponents or his supporters and 
mentors. However, his style is more taciturn than Zwart’s and his reasoning 
more closely argued. 

He believes that all descriptions of natural phenomena involving ‘space’ or 
‘time’ can be ontologically and/or theoretically reduced to descriptions involving 
only the phenomena and relations between them; so his approach might also 
be considered as a relational viewpoint, but it differs radically from Zwart’s 
in its emphasis on the need to remove the illusion of a ‘flow’ of time by the 
employment of ‘detensed’ descriptions. He confronts two severe obstacles to 
a satisfactory expression of this programme: the relativity of simultaneity 
(and hence the lack of a unique ‘present’ implied by Special Relativity), and the 
same theory’s approach to the velocity of light as constant with respect to 
abstract reference frames. His response to these problems is to query the 
validity of Special Relativity and to propose a modified Ritz hypothesis regarding 
light-propagation. 

Hinckfuss’s book does not have the general appeal of Zwart’s, but it should 
be of interest to specialists such as linguistic philosophers and logicians. 

Space, Time and Geometry, edited by Patrick Suppes, is a compendium of the 
Synthese Library. It is a collection of eighteen articles of which fourteen have 
already appeared in 1972 in a double issue of the journal Synthese, and another 
one has been published elsewhere; seven of these deal in diverse ways with 
causality, and the rest with various aspects of the geometry of space and time. 

Some of the articles are very good, for example Robert Palter’s on Kant’s 
formulation of the laws of motion and P. J. Zwart’s summary of the basic ideas 
in his book. Clark Glymour’s contribution on Topology, cosmology and con- 
vention is an important one but is seriously marred by the misprinting (on 
pp. 195, 197, 209, and 210) of much of his mathematical argument. 

Most of the articles are highly specialised and would appeal only to a very 
limited audience. When one considers that they have nearly all previously 
appeared together in convenient form, one wonders why large and expensive 
books like this are published at the present time. 

S. J. PROKHOVNIK 
University of New South Wales 